  1. OpenMRS Core
  2. TRUNK-227

Constraint error when recreating the Concept word index with concept_names containing accented characters.

    Details

    • Complexity:
      Medium

      Description

      For example, two separate Concept Words, "el" and "él" (with an accent over the e), get interpreted as duplicate Concept Words.

      In Spanish and French, for purposes of indexing, these words should actually be treated the same. (In any Spanish or French dictionary, accented characters are alphabetized just the same as their accent-free base letter.)

      So:

      1. When building the unique concept word list, accents should be stripped.
      2. When searching for concepts via concept words, accents should be stripped from the search terms.
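      The two steps above could be sketched roughly as follows. This is only an illustration, not the OpenMRS implementation: the class and method names are hypothetical, and lower-casing the words is an added assumption. It relies on the standard java.text.Normalizer to decompose each character and then drops the combining marks.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the OpenMRS API.
final class AccentFolding {

    // Decompose to NFD so an accented letter becomes base letter + combining mark,
    // then remove every combining mark (Unicode category M).
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Shared folding for indexing and searching (lower-casing is an assumption).
    static String fold(String word) {
        return stripAccents(word).toLowerCase(Locale.ROOT);
    }

    // Step 1: build the unique word list for a concept name, accents stripped.
    static Set<String> uniqueWords(String conceptName) {
        Set<String> words = new TreeSet<>();
        for (String token : conceptName.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(fold(token));
            }
        }
        return words;
    }

    // Step 2: fold the search term the same way before looking it up.
    static boolean matches(Set<String> indexedWords, String searchTerm) {
        return indexedWords.contains(fold(searchTerm));
    }
}
```

      Because the same fold method is used on both sides, "el" and "él" collapse into a single entry in the word list, and a search for either form finds it.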

      From a native Spanish speaker:
      "Ignore the accents the majority of people don't know how to use them correctly. Plus that's how indexes are in Spanish."

      FYI, in Spanish, all accented vowels should be treated as the plain version of those vowels. ñ is actually its own letter, but for indexing it should be treated the same as n. I don't know how this applies to other languages, but as a first pass I think we can just map all accented vowels to their plain form.
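      A first-pass mapping like the one described here might look like the sketch below. The class name and the table are hypothetical, and including ü alongside the accented vowels is an assumption. It only handles precomposed characters, so text stored in decomposed form would pass through unchanged.

```java
// Hypothetical first-pass folding table for Spanish; not taken from OpenMRS.
final class SpanishFolding {

    // Accented vowels, u with diaeresis (an assumption) and n with tilde,
    // written as Unicode escapes, lower case first and then upper case.
    private static final String ACCENTED =
        "\u00e1\u00e9\u00ed\u00f3\u00fa\u00fc\u00f1"
      + "\u00c1\u00c9\u00cd\u00d3\u00da\u00dc\u00d1";

    // Plain replacement at the same position as each entry in ACCENTED.
    private static final String PLAIN = "aeiouun" + "AEIOUUN";

    // Replace each mapped character with its plain form; leave the rest as is.
    static String fold(String word) {
        StringBuilder out = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            int at = ACCENTED.indexOf(c);
            out.append(at >= 0 ? PLAIN.charAt(at) : c);
        }
        return out.toString();
    }
}
```

      An explicit table keeps the behaviour predictable for Spanish, at the cost of having to extend it by hand for other languages.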

            People

            Assignee:
            Unassigned
            Reporter:
            bmckown Brian McKown [X] (Inactive)