How do you remove invalid characters when creating a friendly URL (i.e. how do you create a slug)?

Say I have this page: http://ww.xyz.com/Product.aspx?CategoryId=1

If the name of CategoryId=1 is "Dogs", I would like to convert the URL into something like this: http://ww.xyz.com/Products/Dogs

The problem arises when the category name contains foreign characters (or characters that are invalid in a URL). If the name of CategoryId=2 is "Göra äldre", what should the new URL be?

Logically it should be: http://ww.xyz.com/Products/Göra äldre but that will not work.

Firstly because of the space (which I can easily replace with a dash, for example), but what about the foreign characters? In ASP.NET I can use the URLEncode function, which would give something like this: http://ww.xyz.com/Products/G%c3%b6ra+%c3%a4ldre but I can't really say it's better than the original URL (http://ww.xyz.com/Product.aspx?CategoryId=2).

Ideally I would like to generate the following URL, but how can I do this automatically (i.e. converting foreign characters to 'safe' URL characters): http://ww.xyz.com/Products/Gora-aldre

2019-05-07 00:49:47
Answers: 7

You could add a new field to the Products table that contains a URL-safe and unique name for each product. This could be generated automatically at first (replacing unsafe characters with the closest safe equivalent, e.g. gora-aldre) and then tweaked as needed.

Since the replacement of unsafe characters is not (always) reversible, it isn't entirely feasible to do this sort of thing on the fly.
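
To illustrate why the replacement cannot be reversed, here is a minimal Python sketch. The helper names `to_slug` and `unique_slug` and the sample names are hypothetical, chosen only for illustration. Two different names collapse to the same slug, which is why a stored field also needs a uniqueness check:

```python
import re
import unicodedata

def to_slug(name):
    # Decompose accented letters, then drop everything outside ASCII.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

def unique_slug(name, taken):
    # Append -2, -3, ... until the slug is not already in use.
    base = to_slug(name)
    slug, n = base, 2
    while slug in taken:
        slug = f"{base}-{n}"
        n += 1
    return slug

taken = set()
for name in ["Göra äldre", "Gora aldre"]:
    slug = unique_slug(name, taken)
    taken.add(slug)
    print(name, "->", slug)
```

Both names map to gora-aldre, so the second one is stored as gora-aldre-2. Nothing in the slug tells you which original name it came from.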

Alternatively, you can construct the URL so that it contains both the product key (1234, say) and a safe-string.

Here safe-string is built on the fly, replacing unsafe characters as needed. The number 1234 is the product key. You use the key to look up the product; the safe-string is there more for the user and for search engines.
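
A minimal Python sketch of this scheme, assuming a URL layout of /Products/<key>/<safe-string>. The layout and all three function names are assumptions made for illustration, not something this answer specifies:

```python
import re
import unicodedata

def safe_string(name):
    # Strip accents, lowercase, and turn every other run of characters into a hyphen.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

def product_path(key, name):
    # Assumed layout: the key comes first, the readable part second.
    return f"/Products/{key}/{safe_string(name)}"

def key_from_path(path):
    # Only the numeric key is used for the lookup; the trailing text is ignored.
    match = re.match(r"^/Products/(\d+)(?:/[^/]*)?/?$", path)
    return int(match.group(1)) if match else None

print(product_path(1234, "Göra äldre"))           # /Products/1234/gora-aldre
print(key_from_path("/Products/1234/gora-aldre"))  # 1234
print(key_from_path("/Products/1234/old-name"))    # 1234
```

Because the lookup uses only the key, the safe-string can change (for example after a rename) without breaking existing links.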

2019-05-08 20:38:37

It depends on the language you are using and the technique you want to use. Take a look at this snippet of JavaScript from the Django source; it does exactly what you need. You can easily port it to the language of your choice, I guess.

This is the Python snippet used in the Django slugify function; it's a lot shorter:

import re
import unicodedata

def slugify(value):
    """
    Normalizes string, converts to lowercase, removes non-alpha characters,
    and converts spaces to hyphens.
    """
    # NFKD splits accented letters into base letter + mark; the ASCII step drops the marks.
    value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', '-', value)

I think every language has a port of this, since it's a common problem. Just Google for slugify + your language.

2019-05-08 20:37:09

I've come up with the following two extension methods (ASP.NET / C#):

// Extension methods must be declared inside a static class.
public static string RemoveAccent(this string txt)
{
    // Round-trips through the Cyrillic code page, relying on its best-fit mapping to drop the accents.
    byte[] bytes = System.Text.Encoding.GetEncoding("Cyrillic").GetBytes(txt);
    return System.Text.Encoding.ASCII.GetString(bytes);
}

public static string Slugify(this string phrase)
{
    string str = phrase.RemoveAccent().ToLower();
    str = System.Text.RegularExpressions.Regex.Replace(str, @"[^a-z0-9\s-]", ""); // remove all invalid chars
    str = System.Text.RegularExpressions.Regex.Replace(str, @"\s+", " ").Trim(); // collapse multiple spaces into one
    str = System.Text.RegularExpressions.Regex.Replace(str, @"\s", "-"); // replace spaces with dashes
    return str;
}
2019-05-08 20:06:24

Two things to keep in mind:

  1. URL rewriting generally does not have a positive effect on search engines (and frequently has a negative one), so you should only do it if you know of a measurable positive effect on user satisfaction (and accordingly: make your URLs useful for the users).

  2. If you do decide to do URL rewriting, you have to have the technical details down perfectly. For example, you should never have more than one unique URL showing the same content. Make sure you use UTF-8 for the encoding of non-ASCII content, use escaped links within your content, and generally test on multiple browsers to make sure things work as planned. If any of this is foreign to you, then I would strongly recommend against doing URL rewriting for the moment.
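
As one way to enforce the single-URL rule, a request handler can compare the requested path with the canonical one and answer with a permanent redirect when they differ. The sketch below is framework-neutral Python. `canonical_path`, `resolve`, and the /Products/<slug> layout are hypothetical names chosen for illustration, and `slug` stands for the stored slug of whatever product the request resolved to. The path is percent-encoded as UTF-8 with the standard `urllib.parse.quote`:

```python
from urllib.parse import quote

def canonical_path(slug):
    # Percent-encode as UTF-8 so non-ASCII slugs are escaped in links.
    return "/Products/" + quote(slug, safe="")

def resolve(requested_path, slug):
    """Return (status, location): 200 for the canonical URL, 301 for any variant."""
    target = canonical_path(slug)
    if requested_path == target:
        return 200, target
    return 301, target

print(canonical_path("Göra-äldre"))                       # /Products/G%C3%B6ra-%C3%A4ldre
print(resolve("/Products/gora-aldre", "gora-aldre"))      # (200, '/Products/gora-aldre')
print(resolve("/Products/Gora-Aldre", "gora-aldre"))      # (301, '/Products/gora-aldre')
```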

FWIW, some of the search engine issues are covered at http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html

2019-05-08 14:50:37

The best way, IMO, is to whitelist characters rather than trying to look for invalid characters. However, accented characters like é are fairly common (and your URL will look odd without them), so you can convert these first.

In PHP you can use the strtr function, but you should be able to adapt this to your needs in ASP.NET:


Now here's your process:

  1. [optional] Convert the string to lowercase (generally recommended for URLs).
  2. [optional] Convert the accented characters using the above mapping.
  3. Go through your input string character by character.
  4. It may be faster to do #1 and #2 per character rather than on the whole string, depending on what built-in functions you have.
  5. If the character is in the range a-z or 0-9, add it to your new string; otherwise:
    a) If you already have a hyphen at the end of your new string, ignore it.
    b) If not, add a hyphen to the end of the string.
  6. When you reach the end, remove any leading or trailing hyphens and you're done!
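
The steps above can be sketched in Python roughly as follows. The accent mapping is a hypothetical, deliberately incomplete stand-in for a real mapping, and the function name `slugify_whitelist` is made up for this example:

```python
# Hypothetical, deliberately incomplete mapping; extend it for your own data.
ACCENT_MAP = str.maketrans({
    "à": "a", "á": "a", "ä": "a", "å": "a",
    "é": "e", "è": "e", "ö": "o", "ü": "u",
    "ñ": "n", "ç": "c",
})
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789")

def slugify_whitelist(text):
    text = text.lower().translate(ACCENT_MAP)    # steps 1 and 2
    out = []
    for ch in text:                              # step 3
        if ch in ALLOWED:                        # step 5: whitelisted, keep it
            out.append(ch)
        elif not out or out[-1] != "-":          # step 5b: no hyphen yet, add one
            out.append("-")
        # step 5a: a hyphen is already at the end, so the character is ignored
    return "".join(out).strip("-")               # step 6

print(slugify_whitelist("Göra  äldre!"))         # gora-aldre
print(slugify_whitelist("  Hello, World!  "))    # hello-world
```

Any character missing from both the mapping and the whitelist simply becomes a hyphen, so an incomplete mapping degrades the slug but never produces an invalid one.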
2019-05-08 14:49:30

Wikipedia generally uses non-Latin-1 characters in its URLs. There is no reason (beyond your web server not supporting them) that you shouldn't use these URLs.
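
For what it's worth, such a URL is normally transmitted as percent-encoded UTF-8, and the encoding round-trips without loss. A small Python illustration using only the standard library (the path is an invented example):

```python
from urllib.parse import quote, unquote

path = "/wiki/Göra_äldre"          # invented example path
encoded = quote(path)              # '/' is left unescaped by default
print(encoded)                     # /wiki/G%C3%B6ra_%C3%A4ldre
print(unquote(encoded) == path)    # True
```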

However, if you must avoid these characters, I have found that replacing them with their non-diacritic form works. Most people who read these can tell (from context) what the word is supposed to be even though the diacritics have been removed.
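
One possible way to get the non-diacritic form, sketched in Python with the standard `unicodedata` module (the function name `strip_diacritics` is made up for this example). It decomposes each character and drops the combining marks:

```python
import unicodedata

def strip_diacritics(text):
    # NFD splits "ö" into "o" plus a combining diaeresis; drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("Göra äldre"))    # Gora aldre
print(strip_diacritics("Smørrebrød"))    # Smørrebrød
```

Letters such as ø have no Unicode decomposition, so they pass through unchanged and would need an explicit mapping.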

2019-05-08 14:10:55

Since your post is tagged ASP.NET: have a look at this site; it has sample code to replace (most) characters with diacritics (invalid characters, as you call them) with their base character.

As Kris has said, use a unique ID in your URL, like this website does. If you have no control over the IDs given to you, you should create a translation table that maps your own unique ID to the external unique IDs. That way your internal references stay valid even when the external IDs change. Alongside your unique ID, you store your "Search and Human optimized ID", the one that is not so unique but looks good.
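
A minimal in-memory sketch of such a translation table in Python. All class, field, and ID names here are hypothetical, and a real site would keep this in a database table rather than in dictionaries:

```python
from dataclasses import dataclass

@dataclass
class CategoryRecord:
    internal_id: int      # our own stable key, used for all internal references
    external_id: str      # ID supplied by the outside system; may change
    slug: str             # human- and search-friendly, not necessarily unique

class TranslationTable:
    def __init__(self):
        self._by_internal = {}
        self._by_external = {}

    def add(self, record):
        self._by_internal[record.internal_id] = record
        self._by_external[record.external_id] = record

    def change_external_id(self, internal_id, new_external_id):
        # The internal key stays put; only the external mapping is rewritten.
        record = self._by_internal[internal_id]
        del self._by_external[record.external_id]
        record.external_id = new_external_id
        self._by_external[new_external_id] = record

    def by_external(self, external_id):
        return self._by_external.get(external_id)

    def url_for(self, internal_id):
        record = self._by_internal[internal_id]
        return f"/Products/{record.internal_id}/{record.slug}"

table = TranslationTable()
table.add(CategoryRecord(internal_id=2, external_id="EXT-17", slug="gora-aldre"))
table.change_external_id(2, "EXT-99")
print(table.url_for(2))    # /Products/2/gora-aldre
```

The URL is built from the internal ID and the slug only, so it is unaffected when the external ID changes.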

2019-05-08 10:51:43