Remove accents and diacritical marks from Latin letters in Delphi

When working on datasets in multiple languages, or in a Latin-script language other than English, you may encounter string values with accented letters (such as à, é, è, ï, …).

If your database controls support Unicode but offer no setting to ignore accents, filtering becomes awkward: records containing diacritics and accented letters are easy to miss, because most users won’t type the accents when writing the filter pattern.

Unless your database engine supports accent-insensitive collation, you should add a separate field containing an ASCII version of the indexed field’s content.

In Delphi (and also FPC/Lazarus), you can implement a function that returns text without accents, and use it both to fill the ASCII field and to sanitize filter patterns.

This function is based on the TEncoding.Convert class function from the “System.SysUtils” unit in Delphi (“SysUtils” in FPC).

It works in both IDEs because the local variable’s type is “TBytes” (array of Byte). You can use the generic dynamic array “TArray<Byte>” instead, but that works only in Delphi.

function NoAccent(const Source: string): string;
var
  K: TBytes;
  // K: TArray<Byte>; {Works only on Delphi}
begin
  // Convert the UTF-16 bytes of Source to ASCII bytes.
  K := TEncoding.Convert(TEncoding.Unicode, TEncoding.ASCII,
    TEncoding.Unicode.GetBytes(Source));
  // Turn the resulting ASCII bytes back into a string.
  Result := StringOf(K);
end;

We can call this function as follows:

NoAccent('Délphï');

The snippet above returns the string ‘Delphi’.
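
To tie this back to the filtering problem described earlier, here is a minimal sketch of how NoAccent could be used both to fill the ASCII field and to sanitize a filter pattern. The field names 'NAME' and 'NAME_ASCII' and the two procedure names are hypothetical, and the LIKE syntax with % wildcards is an assumption: filter expression syntax depends on the dataset component you use, so check its documentation before relying on it.

```pascal
// Assumes the NoAccent function shown above is in scope.
// uses SysUtils, DB;  (System.SysUtils and Data.DB in recent Delphi versions)

// Hypothetical helper: copies the accent-free version of 'NAME' into
// 'NAME_ASCII'. Call it while the record is being edited or inserted,
// for example from the dataset's BeforePost event handler.
procedure UpdateAsciiField(DataSet: TDataSet);
begin
  DataSet.FieldByName('NAME_ASCII').AsString :=
    NoAccent(DataSet.FieldByName('NAME').AsString);
end;

// Hypothetical helper: filters on the ASCII field using a pattern that has
// been stripped of accents in the same way as the stored values.
procedure ApplyNameFilter(DataSet: TDataSet; const Pattern: string);
begin
  if Pattern = '' then
  begin
    // An empty pattern means "show everything".
    DataSet.Filtered := False;
    Exit;
  end;
  // QuotedStr wraps the value in single quotes and escapes embedded quotes.
  DataSet.Filter := 'NAME_ASCII LIKE ' +
    QuotedStr('%' + NoAccent(Pattern) + '%');
  DataSet.Filtered := True;
end;
```

Because both the stored value and the user's pattern go through the same function, a user who types 'Delphi' should still match a record whose original text is 'Délphï'.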
