  1. OpenMRS Core
  2. TRUNK-227

Constraint error when recreating the Concept word index with concept_names containing accented characters.

    Details

    • Complexity:
      Medium
      Description

      For example, two separate Concept Words, "el" and "él" (with an accent over the e), are interpreted as duplicate Concept Words.

      In Spanish and French, for purposes of indexing, these words should actually be treated the same. (In any Spanish or French dictionary, accented characters are alphabetized just the same as their accent-free base letter.)

      So:

      1. When building the unique concept word list, accents should be stripped.
      2. When searching for concepts via concept words, accents should be stripped from the search terms.
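
The two steps above could be sketched roughly as follows. This is only an illustration, not the OpenMRS implementation: the class `AccentFolding` and its methods are hypothetical names, and upper-casing the words is an added assumption so that the index is also case-insensitive. The sketch decomposes each string with the standard `java.text.Normalizer` and drops the combining marks, which also folds ñ to n.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of OpenMRS.
final class AccentFolding {

    // Decompose to NFD so "é" becomes "e" + combining acute accent,
    // then remove every combining mark (Unicode category M).
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Step 1: build the unique word list for a concept name with accents stripped.
    // Upper-casing is an assumption made for this sketch.
    static Set<String> uniqueWords(String conceptName) {
        Set<String> words = new TreeSet<>();
        for (String word : conceptName.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(stripAccents(word).toUpperCase(Locale.ROOT));
            }
        }
        return words;
    }

    // Step 2: apply the same folding to a search term before the lookup.
    static String normalizeSearchTerm(String term) {
        return stripAccents(term).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "el él niño", written with escapes to avoid source-encoding issues
        System.out.println(uniqueWords("el \u00e9l ni\u00f1o")); // prints [EL, NINO]
        System.out.println(normalizeSearchTerm("\u00e9l"));      // prints EL
    }
}
```

Because "el" and "él" fold to the same word before insertion, only one entry reaches the unique word list, so the duplicate would not arise, and a search for either spelling matches the same entry.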

      From a native Spanish speaker:
      "Ignore the accents the majority of people don't know how to use them correctly. Plus that's how indexes are in Spanish."

      FYI, in Spanish, all accented vowels should be treated as the plain version of those vowels. ñ is actually its own letter, but for indexing it should be treated the same as n. I don't know how this applies to other languages, but as a first pass I think we can just map all accented vowels to their plain form.

              People

              Assignee:
              Unassigned
              Reporter:
              Brian McKown (bmckown, inactive)