Thread: [Solved] LEN() value of Unicode character string

  1. #1
    Junior Member
    Join Date
    Sep 2012
    Posts
    4

    [Solved] LEN() value of Unicode character string

    Hi, I am an amateur programmer considering a transition from VFP to Lianja. As one of my compatibility tests, I am trying to import a VFP table with Korean character fields into an embedded Lianja database table. First I tried appending directly from the VFP table using the 'append from' command. The result was full of question marks instead of Korean characters. Then I copied the VFP table to a csv file, which by default is an ANSI file, and converted it to a Unicode csv file, line by line, using a third-party editor. When I appended records from this Unicode file, the Lianja table showed the Korean characters correctly.

    But I have a problem: it seems that Lianja and VFP calculate the lengths of Korean character strings differently. In VFP the Len() function for one Korean character returns 2, while Lianja's Len() function returns 3. A long Korean character string saved in a VFP column of 254 characters is truncated when imported into a Lianja column of the same length. How can I solve this problem?
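
For anyone repeating the ANSI-to-Unicode conversion step described above without a third-party editor, here is a minimal Python sketch. It assumes the ANSI csv was written in the Korean Windows code page (cp949), which the post does not state; the function name and file paths are hypothetical.

```python
def transcode_csv(src_path, dst_path, src_encoding="cp949"):
    """Re-encode a text file from a legacy code page to UTF-8, line by line.

    src_encoding="cp949" is an assumption (Korean Windows code page);
    pass the code page the file was actually written in.
    """
    # newline="" leaves the original line endings untouched on both sides.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

A call such as `transcode_csv("customers_ansi.csv", "customers_utf8.csv")` would produce a file that can then be appended as Unicode data.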
    Last edited by barrymavin; 2014-12-23 at 21:30.

  2. #2
    Lianja Development Team barrymavin's Avatar
    Join Date
    Feb 2012
    Location
    UK, USA, Thailand
    Posts
    5,776
    VFP does not support Unicode/UTF-8. It uses Windows-specific code pages, which are now discouraged.

    Lianja uses Unicode/UTF-8 to represent data if the command line switch --locale utf-8 is specified on the desktop shortcut. Lianja will auto-detect many locales that use double-byte characters and automatically use Unicode/UTF-8 internally.

    The standard character encoding for web browsers is UTF-8.

    If you want to import VFP data that you have stored via a code page, you must add --codepagedata to the desktop shortcut for Lianja App Builder and Lianja App Center.

    Lianja/VFP has a set of functions that operate on double-byte characters, LENC() being one of them. The development roadmap lists the ones we have implemented. These are:

    LENC(), LEFTC(), RIGHTC(), AT_C(), RAT_C(), SUBSTRC(), CHRTRANC(), and STUFFC()

    These are the correct functions to use when working with Unicode/UTF-8 characters.

    UTF-8 is the standard encoding for multi-byte character data, and Lianja stores and retrieves data in UTF-8 format to make your data interoperable across desktop and web apps. It is also code page independent, so you can display Korean, Chinese, Japanese, English and other characters without any concern for collating sequences or code pages.

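    One of the great advantages of UTF-8 is that sorting strings by their bytes gives the same order as sorting by Unicode code point, so the characters are represented in a consistent collating sequence. The disadvantage is that it uses more space to represent the characters: a Korean character takes 3 bytes in UTF-8, compared with 2 in the VFP code page.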
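
To put numbers on the space cost, here is a small Python sketch of the arithmetic. It assumes the 254-character field width is counted in bytes (which the truncation reported in post #1 suggests) and that the VFP data used the Korean code page cp949; neither is confirmed in this thread.

```python
text = "한"  # a single Korean syllable

# Bytes needed for one Korean character in each encoding.
cp949_bytes = len(text.encode("cp949"))  # legacy Korean code page (assumed)
utf8_bytes = len(text.encode("utf-8"))

FIELD_WIDTH = 254  # assumed to be a limit in bytes, not characters

# How many Korean characters fit in the field under each encoding.
max_cp949_chars = FIELD_WIDTH // cp949_bytes
max_utf8_chars = FIELD_WIDTH // utf8_bytes

print(cp949_bytes, utf8_bytes)          # 2 3
print(max_cp949_chars, max_utf8_chars)  # 127 84
```

Under those assumptions, a field that held 127 Korean characters in VFP holds only 84 once the data is UTF-8, which would explain a long string being cut off on import.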

    When building Web and Mobile Apps, the characters will be displayed correctly no matter what locale the user working with your App and Data is in.

    So, to summarize:

    LENC() will return 1 for a Korean character, not 2 and not 3.

    LEN() will return 3, as a Korean character is encoded as 3 bytes in UTF-8.

    So if you want to perform string manipulation, use the functions listed above.
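
The byte-versus-character distinction can be illustrated outside Lianja. The Python sketch below is only an analogy: taking the length or a slice of the encoded bytes stands in for LEN() and LEFT() on UTF-8 data, while working on the decoded string stands in for LENC() and LEFTC(). It is not Lianja code.

```python
s = "한글"                 # two Korean characters
utf8 = s.encode("utf-8")   # the bytes as stored in UTF-8

byte_length = len(utf8)    # counts bytes, comparable to LEN()
char_length = len(s)       # counts characters, comparable to LENC()

# A byte-oriented "left 4" cuts through the middle of the second character,
# so decoding the result yields a replacement character (U+FFFD).
left_bytes = utf8[:4].decode("utf-8", errors="replace")

# A character-oriented "left 1", comparable to LEFTC(), stays on a boundary.
left_chars = s[:1]

print(byte_length, char_length)  # 6 2
print(left_chars)                # 한
```

This is why byte-oriented functions are risky for string manipulation on multi-byte data: a cut at an arbitrary byte offset can land inside a character.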
    Last edited by barrymavin; 2014-09-15 at 01:36.
    Principal developer of Lianja, Recital and other products

    Follow me on:

    Twitter: http://twitter.com/lianjaInc
    Facebook: http://www.facebook.com/LianjaInc
    LinkedIn: http://www.linkedin.com/in/barrymavin

  3. #3
    Junior Member
    Join Date
    Sep 2012
    Posts
    4
    Thank you for your prompt reply. Unicode support is a welcome feature, but I'm afraid I will have to store many of my VFP double-byte character fields in Lianja memo fields because of the limits on field length. The maximum length of an index key is also a problem for me. Do you have plans to increase the maximum field and index key lengths?

  4. #4
    Lianja Development Team barrymavin's Avatar
    Join Date
    Feb 2012
    Location
    UK, USA, Thailand
    Posts
    5,776
    I may look at extending the maximum character field length to make it better for UTF-8 character encoding at some point. This should have no effect on existing data.

    The maximum length of an index key is more problematic as it would require all existing indexes to be recreated and many users have live Apps running now.

  5. #5
    Junior Member
    Join Date
    Sep 2012
    Posts
    4
    It's so nice to see that you really care about users' problems. If the maximum length of the index key is a concern, wouldn't it be better to take care of it before too many users go live?

  6. #6
    Lianja Development Team barrymavin's Avatar
    Join Date
    Feb 2012
    Location
    UK, USA, Thailand
    Posts
    5,776
    The problem is that many users have gone live and some of them have huge amounts of data. I don't think they would appreciate a new version that forced them to recreate all their indexes. I will investigate and see if there is a way of maintaining backwards compatibility.

  7. #7
    Junior Member
    Join Date
    Sep 2012
    Posts
    4
    Thank you. It was a very fruitful exchange of communications.

  8. #8
    Lianja Development Team barrymavin's Avatar
    Join Date
    Feb 2012
    Location
    UK, USA, Thailand
    Posts
    5,776
    In v1.2.4, if an index key is longer than the maximum supported length, only the left() portion of it that fits is used. This will work in most cases.
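
As a rough illustration of why left-truncation works "in most cases" but not all, here is a Python sketch. The limit of 6 bytes and the function name are hypothetical, and trimming back to a whole character is my own assumption; the thread does not say how Lianja handles a cut that falls inside a multi-byte character.

```python
def truncate_key(key: str, max_bytes: int) -> bytes:
    """Keep at most max_bytes of the UTF-8 key without splitting a character."""
    data = key.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    cut = data[:max_bytes]
    # Drop a trailing partial character, if the cut landed inside one.
    return cut.decode("utf-8", errors="ignore").encode("utf-8")


MAX_KEY_BYTES = 6  # hypothetical limit, chosen small for the demonstration

a = truncate_key("한글A", MAX_KEY_BYTES)
b = truncate_key("한글B", MAX_KEY_BYTES)

# Keys that differ only beyond the limit become indistinguishable.
print(a == b)  # True
```

Keys that differ within the first `MAX_KEY_BYTES` bytes still index distinctly, so only long keys sharing a long common prefix are affected.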
