Hacker News

> who really cares if “æ” is written as “ae”?

Nitpicking, but if you're writing about text rendering you should know:

Yes, ligatures are really about presentation and not semantics. For example, ﬁ (U+FB01) means the same thing as fi; it just looks neater in some situations.

æ (U+00E6) is not a ligature; it's a mostly obsolete character, with different semantics (or phonetics) than ae.

For example, purely for typesetting beauty, your word processor might substitute the ligature ﬁ for the two letters fi (which can f* search, and I resent both the ligature and lazy search function developers). It would never substitute æ for ae; that would misspell the word as much as substituting an o.
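
To make the difference concrete, here is a minimal Python sketch using the standard library's unicodedata module. The names search_key and contains are made up for illustration, and NFKC compatibility normalization is just one way a search function could fold ligatures, not a claim about how any particular product does it.

```python
import unicodedata

def search_key(text: str) -> str:
    """Fold compatibility characters (such as ligatures) and case for matching."""
    return unicodedata.normalize("NFKC", text).casefold()

def contains(haystack: str, needle: str) -> bool:
    """Substring test on the folded forms of both strings."""
    return search_key(needle) in search_key(haystack)

# "the final file" typeset with the U+FB01 ligature in place of "fi"
text = "the \ufb01nal \ufb01le"

naive_hit = "file" in text           # plain substring search misses the ligature
folded_hit = contains(text, "file")  # NFKC expands U+FB01 to "f" + "i" first

# U+00E6 (ae letter) has no compatibility decomposition, so it is left alone
ae_key = search_key("\u00e6")
```

Under this sketch the ligature is treated as the two letters it stands for, while æ stays a letter of its own and does not match ae.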





> æ (U+00E6) is not a ligature; it's a mostly obsolete character, with different semantics (or phonetics) than ae.

Reading that a letter in my alphabet is mostly obsolete feels really weird. No rebuttal, just a comment.

> It would never substitute æ for ae; that would misspell the word as much as substituting an o.

While that is correct, a lot of other systems actually do this exact substitution. If your name contains æ it will be substituted with ae in passports, plane tickets and random other systems throughout your life.

My own username on this website is an example of a similar substitution. The oe should be read as the single character ø.
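
That kind of substitution can be sketched in a few lines of Python. FOLD and ascii_fold are hypothetical names, and the table only covers the two letters mentioned here; it is not an official transliteration table.

```python
# Illustrative fold table: only the letters discussed in this thread.
FOLD = {
    "\u00e6": "ae",  # ae letter, lowercase
    "\u00c6": "AE",  # ae letter, uppercase
    "\u00f8": "oe",  # o with stroke, lowercase
    "\u00d8": "OE",  # o with stroke, uppercase
}

def ascii_fold(name: str) -> str:
    """Replace each listed letter with its two-letter ASCII spelling."""
    return "".join(FOLD.get(ch, ch) for ch in name)

# "Søren Kjær" written with escapes so the source stays ASCII
folded = ascii_fold("S\u00f8ren Kj\u00e6r")
```

Going the other way is not safe: the ae in a name like Michael was never an æ, and nothing in the folded text says which ae to turn back.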


> Reading that a letter in my alphabet is mostly obsolete feels really weird. No rebuttal, just a comment.

Sorry, I should have specified 'in English'.

> a lot of other systems actually do this exact substition. If your name contains æ it will be substituted with ae

I agree and to clarify, I meant that the reverse substitution doesn't happen.


> I agree and to clarify, I meant that the reverse substitution doesn't happen.

Re-reading your comment, yeah, it's obvious that that was what you meant. My apologies, that’s on me.


No problem at all!

Languages often simplify as they evolve, dropping "annoying" characters like æ. In fact, it was replaced by "e" (or ae itself) in most cases as the words got imported by other languages.

A personal hypothesis is that additional characters were much simpler in the age of handwriting, which covers most of the history of literacy, than in the current age of print.

Using handwriting, additional characters are simple and in fact Medieval European scribes used many abbreviations, etc. When you need to set type on a printing press, or even input a character not already on your computer keyboard, the barrier is higher.


I hope that the implication is that æ is obsolete in English. Because it is used in English!

It's mostly obsolete in English, which I think is safe to say and which does not conflict with it being used. For example, I think few people know how to type it into a computer, while everyone who uses a Latin alphabet can type ae.

To be fair, I think most English speakers are unfamiliar with how to type most accents. But it appears to me that æ is available as a long-press "accented character" just like é and ë, which are also used in English, so they are equally reachable on a mobile phone.

Yes, good point.

> _lazy_ search function developers

doing non-ascii first needs awareness and then quickly becomes tricky (encodings yay).

getting combining characters and/or homoglyphs right is hard.

and if you're still bored out: have fun with Unicode confusables.txt ...

with this in mind I dare to give them lazy bums the benefit of the doubt and rather call them something between naïve and scared.
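
a small Python sketch of two of those traps, assuming only the standard library's unicodedata module (the variable names are made up for illustration):

```python
import unicodedata

# The same visible "é" in two encodings
composed = "\u00e9"     # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

same_raw = composed == decomposed  # naive comparison sees two different strings
same_nfc = unicodedata.normalize("NFC", decomposed) == composed  # NFC composes them

# Homoglyphs are a separate problem: normalization does not merge them
latin_a = "a"
cyrillic_a = "\u0430"   # CYRILLIC SMALL LETTER A, looks like Latin "a"
still_distinct = unicodedata.normalize("NFKC", cyrillic_a) != latin_a
```

normalization handles the first case but not the second, which is where confusables.txt comes in.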


ok, fine. :)

Isn't there a library out there for this common set of problems? I know Unicode provides normalization tables, though I don't know how good they are and I don't know if Unicode also provides a library.


Nitpicking your nitpicking: I think the author meant better.

The "ae" example was used as an introductory example for us English readers. Unlike the Arabic examples where ligatures are mandatory and supported by most Arabic fonts, not many English fonts have an "ae" ligature these days. Not to mention this is a web page and a user can freely apply their !important font styles.

Using æ to mean "treat it as an 'ae' rendered by ligature which is visually indistinguishable" does not mean the author knows nothing about this (although the wording can use some improvement to reduce the ambiguity).


I don't understand: æ is not a ligature so it's not an example of a ligature. There are English ligatures to use.

Also, most fonts have many characters beyond ASCII, including æ. If your font lacked it then you would see an empty box, not the two letters ae. Applying a font style would not change the rendering of æ into ASCII letters; I don't think it changes the rendering of English ligatures, which are separate code points in Unicode.


> mostly obsolete

The Nordic languages beg to differ!


Keep Swedish out of this, you dirty Danes!

Edit: Checked out your profile, correcting myself: "you silly north-Danes!"


Yes, sorry, I should have said, 'in English'.


