Does MLS support an extended character set?

Discussions around using and interfacing with the Church MLS program.
RossEvans
Senior Member
Posts: 1346
Joined: Wed Jun 11, 2008 8:52 pm
Location: Austin TX

Post by RossEvans » Thu Jun 25, 2009 11:28 am

What character set is supported in MLS?

I naively thought it was confined to vanilla ASCII, but today I noticed the following string within the Full Name, Maiden Name and Preferred Name fields for a recent move-in:

Jéssica

Note the second character, which is a hex E9. It also is present in the Membership.csv export. I did not know that support for such characters was built into MLS. (I knew that Windows supports this larger character set, but I did not know all those characters were being used in the application.) Because all three fields have the extended character, I assume this data was downloaded from CHQ as part of a send-receive transaction.

If an extended character set -- I assume it would be the Latin 1 set, the native Windows 1252 code page -- is supported, how can clerks enter extended characters into MLS?
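One way to test the Windows-1252 guess is to look at the raw bytes of the name. The snippet below is only a sketch: the byte string is typed in by hand to mirror the hex E9 reported above, not read from a real Membership.csv, and it assumes the export uses a single-byte encoding.

```python
# Bytes as they might appear in the export: "J", 0xE9, "ssica".
# This literal is an assumption based on the hex value reported above.
raw = b"J\xe9ssica"

# Windows-1252 and ISO 8859-1 (Latin-1) agree on 0xE9: both map it to é.
as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

# In UTF-8, é is the two-byte sequence C3 A9, so a lone E9 followed by
# an ASCII letter is not valid UTF-8 and decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(as_cp1252, as_latin1, valid_utf8)
```

If the real file behaves the same way, a single E9 byte points to a one-byte-per-character code page rather than UTF-8, although it cannot distinguish Windows-1252 from Latin-1.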

We have some members with names such as Castaño or Nuñez, which are routinely mangled to Castano or Nunez. If we can fix that, I'd like to know how.

aebrown
Community Administrator
Posts: 14693
Joined: Tue Nov 27, 2007 8:48 pm
Location: Sandy, Utah

Post by aebrown » Thu Jun 25, 2009 12:13 pm

boomerbubba wrote: What character set is supported in MLS?


Since MLS is a Java application, I would think it likely that it supports Unicode, which of course has far more characters than Latin-1. This conjecture is supported by the fact that MLS is available in a wide variety of languages, including many European languages, as well as Korean, Japanese, and Chinese, whose characters cannot be represented in a single-byte character set such as Latin-1. For some of those characters you probably need specific system fonts installed, but for most characters, you should be able to simply enter them using any of the normal methods.
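To make the Latin-1 versus Unicode point concrete, here is a small sketch in Python (not Java, and not a claim about how MLS stores text). The sample names are made up for illustration, and `fits` is a hypothetical helper.

```python
# Made-up sample names: two Western European, one Korean, one Japanese.
names = ["Castaño", "Jéssica", "김", "田中"]

def fits(text, encoding):
    """Return True if every character of text can be encoded in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

for name in names:
    # Columns: name, representable in Latin-1?, representable in UTF-8?
    print(name, fits(name, "latin-1"), fits(name, "utf-8"))
```

The accented Western European names fit in Latin-1, while the Korean and Japanese ones need a Unicode encoding such as UTF-8.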

For those unfamiliar with the Windows techniques for entering non-ASCII characters, here are some options:

  1. Use the Character Map accessory (Start > Programs > Accessories > System Tools > Character Map), which lists all the characters. It's helpful to choose the Arial Unicode MS font. Then you can easily copy and paste characters from the Character Map into any application. This is the easiest way for most people. You have to do a bit of hunting to find some of the characters, but they're all there.
  2. If you know the code, you can just hold down the Alt-key, type the number using the number pad, then release the Alt-key. I do this all the time for characters I know, such as ñ (164), ä (132), ö (148), ü (129), ß (225), á (160), é (130), í (161), ó (162), ú (163), ç (135), Ç (128).
  3. Switch the keyboard to an appropriate international keyboard, using Regional and Language Settings on the Control Panel. But if you're not used to international keyboards, this could really throw you for a loop, since many of the keys will produce letters you don't expect.

All of the above work fine with MLS. Ideographs don't display properly on my system, but I imagine that's just because I don't have the right system fonts.
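For anyone who would rather look up an Alt code than memorize it, the numbers in option 2 can be derived from a code page table. This is a rough sketch that assumes the 3-digit Alt codes follow the old DOS code page 437, which is typical on US English Windows; other regional settings may use a different OEM code page. `alt_code` is a hypothetical helper name.

```python
def alt_code(char, codepage="cp437"):
    """Return the 3-digit Alt code for char, assuming the given OEM code page.

    Raises UnicodeEncodeError if the character is not in that code page.
    """
    return char.encode(codepage)[0]

# The characters mentioned in option 2 above.
for ch in "ñäöüßáéíóúçÇ":
    print(ch, alt_code(ch))
```

Under that assumption the output matches the codes quoted in the list, for example 164 for ñ and 130 for é.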

Post by RossEvans » Thu Jun 25, 2009 1:08 pm

Thanks. I had forgotten about the standalone Windows utility for character-selection. My own lazy practice on most computers has been to use the Insert -> Symbol utility built into MS Word. I now see that OpenOffice Writer also has a similar function. But the Windows utility is a better choice.

Alan_Brown wrote: If you know the code, you can just hold down the Alt-key, type the number using the number pad, then release the Alt-key. I do this all the time for characters I know, such as ñ (164), ä (132), ö (148), ü (129), ß (225), á (160), é (130), í (161), ó (162), ú (163), ç (135), Ç (128).


Careful. If you populate your wetware RAM with stuff like decimal codes for obscure characters, there may not be memory left for useful trivia from history, sports or Hollywood. :)

This has nothing to do with MLS directly, but I will add a word of caution to anyone processing text files such as Membership.csv if those files contain non-ASCII characters. If you use FTP to transfer files, avoid what FTP calls "text" or "ascii" mode. FTP is a very old protocol, and non-ASCII characters can get mangled badly. (The letter ñ will probably not become n. It might become a control character!) Also make sure that applications you use downstream, such as database products, are configured to support the full set of characters in your data.
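Here is a minimal sketch of both cautions. The CSV row is invented, the column layout is an assumption rather than the real Membership.csv format, and clearing the high bit is just one way a 7-bit channel could damage the data, not a description of any particular FTP client.

```python
import csv
import io

# One made-up row, encoded the way a Windows-1252 export would store it.
original = "Full Name,Preferred Name\r\nNuñez,Jéssica\r\n".encode("cp1252")

# Downstream: decode with the encoding stated explicitly instead of
# relying on a platform default, then parse as CSV.
rows = list(csv.reader(io.StringIO(original.decode("cp1252"), newline="")))

# Simulate a 7-bit channel by clearing the high bit of every byte.
stripped = bytes(b & 0x7F for b in original).decode("ascii")

print(rows[1])
print(stripped)
```

With the explicit encoding the names survive intact. With the high bit cleared, ñ and é silently turn into unrelated ASCII letters, which is harder to spot than a missing accent.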

russellhltn
Community Administrator
Posts: 20778
Joined: Sat Jan 20, 2007 2:53 pm
Location: U.S.

Post by russellhltn » Thu Jun 25, 2009 1:51 pm

boomerbubba wrote: Careful. If you populate your wetware RAM with stuff like decimal codes for obscure characters, there may not be memory left for useful trivia from history, sports or Hollywood. :)


Just one of the occupational hazards of always chipping in your 2(Alt-155)¢ :D

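Just as a note, there are two sets of Alt codes: 3-digit and 4-digit (with a leading zero). The 3-digit codes come from the old DOS (OEM) code page, while the 4-digit codes come from the Windows code page, so Alt-0162 (¢) and Alt-162 (ó) are different things. Below is a partial list of the 3-digit ones.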

128 Ç
129 ü
130 é
131 â
132 ä
133 à
134 å
135 ç
136 ê
137 ë
138 è
139 ï
140 î
141 ì
142 Ä
143 Å
144 É
145 æ
146 Æ
147 ô
148 ö
149 ò
150 û
151 ù
152 ÿ
153 Ö
154 Ü
155 ¢
156 £
157 ¥
158 ₧
159 ƒ
160 á
161 í
162 ó
163 ú
164 ñ
165 Ñ
168 ¿
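
The difference between the two sets can be reproduced with a couple of lines of Python. This is a sketch that assumes the 3-digit codes map to code page 437 and the 4-digit codes to Windows-1252, as on a typical US English system; the function names are made up for the example.

```python
def alt_3_digit(code):
    """Character for Alt-NNN, assuming the OEM code page is CP437."""
    return bytes([code]).decode("cp437")

def alt_4_digit(code):
    """Character for Alt-0NNN, assuming the Windows code page is 1252."""
    return bytes([code]).decode("cp1252")

# Same number, different character depending on the leading zero.
print(alt_3_digit(162), alt_4_digit(162))

# Rebuild the 3-digit list above for codes 128 through 165.
table = {code: alt_3_digit(code) for code in range(128, 166)}
print(table[155], table[164])
```

Note that a handful of byte values are unassigned in Windows-1252, so `alt_4_digit` raises an error for those rather than returning a character.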

mkmurray
Senior Member
Posts: 3241
Joined: Tue Jan 23, 2007 9:56 pm
Location: Utah

Post by mkmurray » Thu Jun 25, 2009 2:15 pm

My favorite character is the Icelandic Thorn (Alt+0254) þ

We use it at work as a string delimiter for everything. :)

Plus it is a great "tongue-sticking-out-mouth" for emoticons! =0þ
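
For the curious, using thorn as a delimiter might look something like this. It is only an illustrative sketch with invented field values, not the actual format used at anyone's workplace.

```python
DELIMITER = "\u00fe"  # þ, LATIN SMALL LETTER THORN (Alt+0254 in Windows-1252)

# Made-up fields; note that one of them contains a comma.
fields = ["Nuñez", "Austin, TX", "2009-06-25"]

# Join and split on thorn, so commas in the data need no quoting.
record = DELIMITER.join(fields)
parsed = record.split(DELIMITER)

print(record)
print(parsed)
```

The obvious catch is that this breaks as soon as the data itself contains þ, for example in an Icelandic name.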

