
Schrödinger's 😻 and outside-the-box naming

Posted Apr 5, 2013 9:22 UTC (Fri) by epa (subscriber, #39769)
In reply to: Schrödinger's 😻 and outside-the-box naming by hamjudo
Parent article: Schrödinger's 😻 and outside-the-box naming

Is there some officially defined subset of Unicode which includes characters needed for writing but excludes cats, PILE OF POO and so on?



Schrödinger's 😻 and outside-the-box naming

Posted Apr 8, 2013 15:02 UTC (Mon) by hamjudo (guest, #363) [Link]

There are a variety of Unicode regular expression libraries. Characters are grouped into classes: alphabetic, currency symbols, numerals, uppercase, lowercase, and so on. Companies running search engines will apply their own rules (a.k.a. business logic) to decide what to index for each part of a document, possibly depending on the type of document. Some may recognize source code and then use programming-language-specific rules for indexing symbols (C is case sensitive, Fortran is not).
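As a rough illustration of the character classes mentioned above, here is a minimal Python sketch built on the standard `unicodedata` module. The function name `char_class` and the coarse labels it returns are made up for this example; real regex libraries and search engines will draw the lines differently.

```python
import unicodedata


def char_class(ch: str) -> str:
    """Map one character to a coarse class via its Unicode general category.

    The labels are illustrative only; they are not an official taxonomy.
    """
    cat = unicodedata.category(ch)  # two-letter code such as "Lu" or "So"
    if cat == "Lu":
        return "uppercase"
    if cat == "Ll":
        return "lowercase"
    if cat.startswith("L"):
        return "other letter"
    if cat == "Nd":
        return "decimal digit"
    if cat == "Sc":
        return "currency symbol"
    if cat == "So":
        return "other symbol"
    return cat  # fall back to the raw category code


if __name__ == "__main__":
    # U+1F63B is the cat face from the article title.
    for ch in "A\u00f67\u20ac\U0001F63B":
        print(f"U+{ord(ch):04X}", char_class(ch))
```

The cat lands in the general category "So" (Symbol, other), together with many non-emoji symbols, so an indexer that wanted to treat cats specially would need a finer rule than the general category alone.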

If enough people start using the cat characters in semantically significant ways, there will be a business case for entities to index those characters.

Schrödinger's 😻 and outside-the-box naming

Posted Apr 18, 2013 15:12 UTC (Thu) by mirabilos (subscriber, #84359) [Link]

The BMP (Basic Multilingual Plane) is mostly decent.

Best of all, it fits into 16 bits (0‥FFFD inclusive) and its UTF-8 form needs at most three octets.
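A small Python sketch of what restricting text to that range could look like. The helper names `in_bmp`, `utf8_len` and `strip_non_bmp` are hypothetical, and the upper bound 0xFFFD is simply taken from the range given above; this is one possible filter, not an officially defined subset.

```python
BMP_MAX = 0xFFFD  # upper bound as quoted in the comment above (assumption)


def in_bmp(ch: str) -> bool:
    """True if the character's code point fits in the 0..FFFD range."""
    return ord(ch) <= BMP_MAX


def utf8_len(ch: str) -> int:
    """Number of octets the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def strip_non_bmp(text: str) -> str:
    """Drop every character outside the range, keeping the rest unchanged."""
    return "".join(ch for ch in text if in_bmp(ch))


if __name__ == "__main__":
    title = "Schr\u00f6dinger's \U0001F63B"
    for ch in "\u00f6\u20ac\U0001F63B":
        print(f"U+{ord(ch):04X}", in_bmp(ch), utf8_len(ch))
    print(repr(strip_non_bmp(title)))
```

Note that such a filter removes the cat (U+1F63B, four octets in UTF-8) but would still let through plenty of pictographic symbols that live inside the BMP, which fits the "mostly decent" caveat.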


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds