Fix the comparison of strings with Unicode Latin Extended-D characters
Could you please improve the handling of Unicode Latin Extended-D characters in MS Access queries and filters? Currently, characters in this range (introduced in Unicode 5.1 back in 2008) are compared incorrectly. For example, SELECT (ChrW(42786)=ChrW(42841)) AS Expr1; returns True even though two different characters are compared. The same is true for any two characters in this Unicode range. The problem exists only in queries; in VBA code these strings are compared correctly (except by the StrComp function with the vbDatabaseCompare option). See MSDN c76d9e5f-adbc-41f3-a2ed-dc25edf34a14
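For reference, a minimal Python sketch (outside Access, so only an illustration of the expected result) confirming that the two code points in the query above are distinct characters. The variable names are chosen here for illustration; the character names come from whatever Unicode database the local Python build ships with.

```python
import unicodedata

# The two code points used in the query from the report.
a = chr(42786)  # U+A722
b = chr(42841)  # U+A759

for ch in (a, b):
    # unicodedata.name returns the fallback string if this Python build's
    # Unicode database has no name for the character.
    name = unicodedata.name(ch, "<no name in this Unicode database>")
    print(f"U+{ord(ch):04X} {name}")

# A plain code-point comparison treats them as different characters.
print(a == b)  # False
```

An ordinal comparison like this is what VBA's default binary comparison does as well, which matches the observation that only the database engine is affected.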
Alex F. Ilin commented
All other problems reported in the comments below are solved by changing the sort order from "General - Legacy" to "General" (I set "New database sort order" to "General" and ran "Compact & Repair Database" on my database). So the problems with emojis, Javanese, Khmer, Musical Symbols and Greek Extended are easily solvable.
However, the original problem reported by the topic-starter persists even with the General sort order (and with other sort orders I tried).
SELECT (ChrW(42786)=ChrW(42841)) AS Expr1 still returns True although two different characters are compared.
The same happens with literal characters: SELECT ("Ꜣ"="Ꜥ") AS Expr1 also returns True.
Fred Chap commented
Unicode 5.1 is not the only version that is unsupported. The problem goes back to Unicode blocks added in Unicode 4.0 (April 2003), such as Khmer Symbols: SELECT "᧤"="᧻" AS Expr1 returns True but should return False.
Same with Tagalog (Unicode 3.2, March 2002): SELECT "ᜅ"="ᜇ" AS Expr1
Same with Musical Symbols and Gothic (Unicode 3.1, March 2001): SELECT "𝄬"="𝄠" AS Expr1, "𐌱 "="𐌵" AS Expr2, "𝄬 "="𐌵" AS Expr3, "a "="b" AS Expr4
Of the Unicode blocks added in Unicode 3.0 (September 1999), some are supported (Myanmar) and some are not (Cherokee, Runic, Syriac).
Even Greek Extended, added in Unicode 1.1 (June 1993), is still not supported by Access: SELECT "ἂ"="ὡ" AS Expr2
Please add support for all Unicode scripts or give users the ability to create custom character comparison tables.
Oleg Pchelkin commented
The Javanese script is also affected. Run these four queries:
SELECT 'ꦫ' AS SIGN INTO JAVANESE ;
INSERT INTO JAVANESE ( SIGN ) SELECT 'ꦗ' AS SIGN;
INSERT INTO JAVANESE ( SIGN ) SELECT 'ꦣ' AS SIGN;
SELECT SIGN FROM JAVANESE WHERE SIGN = 'ꦫ'
The last query returns three records instead of one, because for Microsoft Access all Javanese signs are the same (SELECT 'ꦫ' = 'ꦣ' as COMPARISON_RESULT; returns True).
Javanese signs even compare equal to emoji (SELECT 'ꦧ' = '😦' as COMPARISON_RESULT; returns True).
Note that VBA is not affected, only the database engine.
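To show the expected result of the queries above, here is a small sketch that runs the equivalent statements against an in-memory SQLite database from Python. SQLite is only a stand-in engine used for comparison (it is not Access/Jet, and its default text comparison is binary); the table and column names are taken from the queries above.

```python
import sqlite3

# In-memory SQLite database used as a reference engine (not Access/Jet).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JAVANESE (SIGN TEXT)")
conn.executemany(
    "INSERT INTO JAVANESE (SIGN) VALUES (?)",
    [("ꦫ",), ("ꦗ",), ("ꦣ",)],
)

# Equivalent of: SELECT SIGN FROM JAVANESE WHERE SIGN = 'ꦫ'
rows = conn.execute(
    "SELECT SIGN FROM JAVANESE WHERE SIGN = ?", ("ꦫ",)
).fetchall()
print(len(rows))  # 1 record, not 3

# Equivalent of: SELECT 'ꦧ' = '😦' AS COMPARISON_RESULT
equal = conn.execute("SELECT ? = ?", ("ꦧ", "😦")).fetchone()[0]
print(equal)  # 0 (false)

conn.close()
```

With a binary comparison, the filter returns only the matching record and the Javanese sign does not compare equal to the emoji, which is the behavior expected from the Access queries.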
Peter Steinhardt commented
Yes, please add support for emojis to the database engine.
Alexander Ilin-Tomich commented
Same problem with emojis. The Microsoft Access Jet engine does not distinguish between different emojis. Try creating a new table with an emoji field containing values like 😖, 😢, 😭, 😦 and then run a query to select only 😦. Access will return all records instead of just 😦.