Accented Characters

Postby FORTUETA » Wed Sep 26, 2007 2:11 am

Hello!!!

People who use a language with accented characters have had a problem with X1 since version 5.2.3.

"Habitación" and "Habitacion" should be the same word for X1 (since version 5.2.3 they are treated as different words).

I can't buy X1 for my company if it doesn't work properly!!!!

I think that a large part of the non-English users of X1 need this fix.

Please fix it!!!!
(I'm still using version 5.2.3)
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Postby w0qj » Wed Sep 26, 2007 6:43 am

If anyone else feels this feature (Unicode search support) is needed, please feel free to contribute to this discussion thread!

[This discussion moved to Feature Request Forum]
Rgds / Mr. Wong

"It is what X1 can do with the information found that is important."
w0qj
X1 Guru
 
Posts: 1183
Joined: Wed Jun 16, 2004 3:53 am
Location: Hong Kong

Postby Tod » Wed Sep 26, 2007 11:47 am

To clarify - X1 does actually support some Unicode indexing and querying. What the original poster is requesting is that X1 strip out accents when indexing, so that Habitación and Habitacion are both returned by the same query.

Some users have reported this as a bug and insisted that X1 properly index accents. Please give us your view on this - should X1 index accents and therefore require you to query with accents?
Tod
X1 Rep
 
Posts: 178
Joined: Thu Apr 07, 2005 8:26 pm
Location: Pasadena, CA

Postby gt13 » Sun Oct 07, 2007 11:35 am

This feature is not only needed, it is mandatory for everyone who uses accented characters.

The reason is that accented letters are written almost randomly in files. This is due:
- to the fact that old text (ASCII) files were not accented
- to the fact that accented characters are more difficult to type on the keyboard
- to the fact that uppercase letters are generally not accented
- to the fact that everybody understands the text even if it is not accented
- to the fact that very often people do not know the correct writing and the kind of accentuation
- taking "e" for example, in French, you can find 10 characters: e é è ê ë E É È Ê Ë
- same problem for "a", "i", "o", "u", c ç C Ç

It means that it is IMPOSSIBLE to find accented words if the search engine distinguishes between all these characters.

Let us take an example: suppose that you are searching for the word "fenêtrées".
There are 4 "e"s, and even if you suppose that the only possibly accented ones are the second and the third, and that the person who wrote it could reasonably only have used (e é è ê), you have 4x4=16 searches to do!
And what if your search contains several accented words? :evil:
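
To make the arithmetic above concrete, here is a minimal Python sketch that enumerates the spellings an accent-sensitive search would force you to try. The helper name `spelling_variants`, the `?` placeholder convention, and the candidate letters are illustrative assumptions taken from the example; none of this is an X1 feature.

```python
from itertools import product

# Candidate spellings for each ambiguous "e", as assumed in the example.
CANDIDATES = "eéèê"

def spelling_variants(template: str, candidates: str = CANDIDATES) -> list[str]:
    """Expand every '?' in the template with each candidate letter."""
    slots = template.count("?")
    variants = []
    for combo in product(candidates, repeat=slots):
        word = template
        for letter in combo:
            # Fill the placeholders one at a time, left to right.
            word = word.replace("?", letter, 1)
        variants.append(word)
    return variants

# "fenêtrées" with the second and third "e" left open.
variants = spelling_variants("fen?tr?es")
print(len(variants))
print(variants[:4])
```

Each extra ambiguous letter multiplies the number of queries by the number of candidates, which is why the count grows so quickly for longer words or multi-word searches.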

At present, search engines do not distinguish between lowercase and uppercase characters. Why isn't it possible to define equivalences between several characters instead of just 2? In the same way as (e E) are equivalent, it should be possible to decide that (e é è ê ë E É È Ê Ë) are equivalent in a search.

Of course, for some purposes, it would also be useful to deactivate this feature and distinguish between all the characters.
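
The equivalence idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard `unicodedata` module to decompose each character and drop the combining marks, and the function name `fold` and its `ignore_accents` switch are hypothetical, not X1's actual mechanism.

```python
import unicodedata

def fold(text: str, ignore_accents: bool = True) -> str:
    """Return a comparison key: case-insensitive, optionally accent-insensitive."""
    if ignore_accents:
        # NFD splits "é" into "e" followed by a combining acute accent (U+0301).
        decomposed = unicodedata.normalize("NFD", text)
        # Drop the combining marks and keep only the base letters.
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return text.casefold()

# All ten French spellings of "e" collapse to a single key.
print({fold(ch) for ch in "eéèêëEÉÈÊË"})

# With the switch off, accented and plain spellings stay distinct.
print(fold("Habitación", ignore_accents=False) == fold("Habitacion", ignore_accents=False))
```

Two words are then treated as equal whenever their keys are equal, in the same way that case-insensitive search compares lowercased text.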

This is the point of view of the user.
OK, it is perhaps not easy to implement, but it ought to be.
And as long as it is not, this software cannot be used with French, Spanish, German, Northern European, and many other languages.

This is the reason why many of us stay with X1 version 5.2.3 (Build 1852bz-bs) (Released Friday, August 26, 2005, that is, more than 2 years ago!), which works fine from this point of view.
LATER EDIT (Feb. 24th, 2008): in fact, this claim seems not to be entirely accurate: later posts in this thread show that some diacritics cause problems.

And it is not the first time that this problem is pointed out: see for instance point ii) in http://forums.x1.com/viewtopic.php?p=5762

Gerard
Last edited by gt13 on Sun Feb 24, 2008 1:45 pm, edited 4 times in total.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby bluegecko » Tue Oct 09, 2007 1:17 pm

I work extensively with multilingual documents, and must say I agree with Gerard. The ideal solution would be to offer an option (check box) beside the search field to "ignore accents".

Two simple test cases:

1. Searching for the French word for tea, "thé", obviously I'd not want to ignore accents, otherwise I'd get a list of virtually every file on my hard disk...

2. On the other hand, transliterations (for instance from Arabic) will often have accents all over the place, with all manner of different schemes, some valid, some not, plus typos and each author's foibles coming into play. For example, Mauritania's main port town is Nouadhibou, also spelled Nouâdhibou, Nouâdhiboû, Nouádhibou, etc etc, plus the same using macrons instead of circumflexes. Needless to say, to find Nouadhibou, I'd need the option to ignore accents.
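
The two test cases suggest a per-query switch rather than a single global behaviour. As a rough Python sketch (the `search` function, its `ignore_accents` parameter, and the sample documents are made up for illustration and say nothing about how X1 works internally):

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def search(docs: dict[str, str], word: str, ignore_accents: bool) -> list[str]:
    """Return the names of documents containing the word as a whole token."""
    def key(s: str) -> str:
        s = unicodedata.normalize("NFC", s).casefold()
        return strip_accents(s) if ignore_accents else s

    wanted = key(word)
    return sorted(name for name, body in docs.items()
                  if wanted in {key(tok) for tok in body.split()})

docs = {
    "menu.txt": "un thé vert",
    "notes.txt": "the quick brown fox",
    "port1.txt": "Nouâdhiboû harbour",
    "port2.txt": "Nouādhibou harbour",
}

# Case 1: accents matter, so "thé" must not match "the".
print(search(docs, "thé", ignore_accents=False))
# Case 2: accents ignored, so every transliteration is found.
print(search(docs, "Nouadhibou", ignore_accents=True))
```

The circumflex and macron spellings both reduce to the same plain form, so one unaccented query covers them.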

Thanks
bluegecko
 
Posts: 1
Joined: Tue Oct 09, 2007 1:06 pm

How many people in the world have this problem?

Postby FORTUETA » Wed Jan 09, 2008 1:57 am

Hi!!!
Thank you everybody for your answers.

Gerard, you gave a very good explanation of the problem.

I agree with bluegecko on the solution: a check box labelled "ignore fu* accents".

Will X1 respond to this problem someday? Will it be fixed?

How many people in the world have this problem? 30% of the world population? 1,800,000,000 people?

I have been waiting for years to have the problem fixed.
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Postby pgk » Wed Jan 09, 2008 6:04 am

Looks like everybody agrees. So: index accented versions of the same letter as that letter, and put an option somewhere in the options dialog that allows users to turn on "indexing of accented characters", with Gerard's response as a one-time pop-up explaining the risks. Would that work? -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby pgk » Wed Jan 09, 2008 3:56 pm

Just posted elsewhere: this is pretty much how it has been implemented in Vista. Indexing accents is turned off by default, but can be turned on if needed. The option is called "Treat similar words with diacritics as different words", and a warning about reindexing is given if the option is changed. No info is given on the risk of not finding 'mis-accented' words though. -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby cwiekol » Mon Jan 28, 2008 9:46 am

Hi there,
I'm a new X1 user and ready to buy the new 6.0 version. It's impressive in comparison with Copernic Desktop Search, but the problem with diacritic letters makes your search tool completely unusable for Poles (and not only Poles, I suppose). In my language the characters eęEĘ, oóOÓ, aąAĄ, sśSŚ, lłLŁ, zżźZŻŹ, cćCĆ, nńNŃ should mean the same to a search engine.
Please let me know when I can expect this problem to be fixed. For now I will try to find some older release, as earlier posters suggested, but I don't want to pay for it....
regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby pgk » Mon Jan 28, 2008 10:48 am

Tod, given that X1 has done both in different versions, it should be a piece of cake to leave this as a 'set once' user option. Have you put this on the roadmap? Given that it affects entire nations, seems like it should be reasonably high on the list! -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby cwiekol » Sat Feb 02, 2008 7:01 am

Unfortunately, older versions (I've got 5.2.18..) have the same bug.
I'll have to try Microsoft desktop search (sic!)..
I will try your product again in the future to find out whether you have fixed this problem...
regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby cwiekol » Sat Feb 02, 2008 12:23 pm

[Image]

This is what we really need, and what, I think, only Microsoft's WDS has...
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby gt13 » Sun Feb 03, 2008 12:43 pm

If you want to go back to a free version without this "bug", go to the page below and follow the light green instructions (there are some lines in English in the middle of the French ones) to get Yahoo! Desktop Search version 1.2 Build 1852je, which is almost identical to X1 Version 5.2.3 (Build 1852bz-bs) (Released Friday, August 26, 2005).
That is, to my knowledge, one of the best free desktop search tools, and it has no problem with diacritics.
http://snipurl.com/2iuez
Last edited by gt13 on Sat Dec 06, 2008 10:50 am, edited 1 time in total.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby gt13 » Sun Feb 03, 2008 12:52 pm

I just propose something to test your desktop search software.
The result is here (Excel file): http://snipurl.com/7fugo
And you can get instructions and the files in order to do the same here (zip file): http://snipurl.com/7fumt

I will add the results of Copernic Desktop Search soon.
Last edited by gt13 on Sat Dec 06, 2008 12:13 pm, edited 2 times in total.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby pgk » Sun Feb 03, 2008 6:22 pm

Wow, that's a pretty extensive test! And I'm ashamed to say I never knew that Excel could make slanted column headings. Nice. -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby gt13 » Sun Feb 03, 2008 7:17 pm

I just updated the test with some more features and Copernic results
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby cwiekol » Mon Feb 04, 2008 4:25 am

[gt13]

Thanks for the help. I've installed this but it doesn't work for me.. the word "POZWOLIŁEM" is still treated as different from "POZWOLILEM". I will have to use Windows Desktop Search until X1 fixes this problem..
I think that many people don't even know that this problem exists (I didn't until last week)..

regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby gt13 » Mon Feb 04, 2008 5:55 am

@ cwiekol
It seems to prove that there is a difference between accented letters (used in French for instance) and some more complicated diacritics.
I will include "POZWOLIŁEM" in the next version of my tests! (=> LATER EDIT: it is now done)
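
One possible explanation for the difference, offered as a guess rather than a statement about X1's or Yahoo's code: in Unicode, letters such as "é" or "Ś" decompose into a base letter plus a combining mark, whereas "Ł" (L with stroke) has no such decomposition, so a folding routine that only strips combining marks leaves it untouched. The Python sketch below shows this with the standard `unicodedata` module; the `EXTRA_FOLDS` table and the function names are hypothetical.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Fold accents by decomposing (NFD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Letters with no canonical decomposition need an explicit mapping.
# This table is illustrative and deliberately incomplete.
EXTRA_FOLDS = str.maketrans({"Ł": "L", "ł": "l", "Ø": "O", "ø": "o", "Đ": "D", "đ": "d"})

def fold_letters(text: str) -> str:
    """Strip combining marks, then apply the explicit table."""
    return strip_marks(text).translate(EXTRA_FOLDS)

print(strip_marks("POZWOLIŁEM"))   # mark stripping alone leaves the stroke in place
print(fold_letters("POZWOLIŁEM"))  # the explicit table handles it
print(fold_letters("ŚRÓDŁĄCZĘ"))
```

Under this assumption, French accents and most Polish diacritics would fold correctly while words containing "Ł" would still fail to match.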
Sorry
Last edited by gt13 on Sun Feb 24, 2008 1:49 pm, edited 2 times in total.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby cwiekol » Thu Feb 07, 2008 2:22 am

It seems to me that only Windows Desktop Search "knows" everything about diacritics..

Don't use "pozwoliłem"; use ŚRÓDŁĄCZĘ, which should be treated the same as SRODLACZE ;)

regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby cwiekol » Fri Feb 22, 2008 7:59 am

It's a pity that no one from the X1 team is doing anything about this (or at least they are not telling us about it).
Please let me know whether you (the X1 team) are going to do anything about this issue..
regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby w0qj » Sun Feb 24, 2008 10:15 am

As far as I know, Unicode support (eg: accented characters) is not even on the roadmap of future development for X1.

It may be a looong time, if ever, that there'll be Unicode support...


If folks feel so strongly about Unicode support, I suggest you start a petition thread in this same Feature Request Forum. Tks!
Rgds / Mr. Wong
w0qj
X1 Guru
 
Posts: 1183
Joined: Wed Jun 16, 2004 3:53 am
Location: Hong Kong

Postby pgk » Sun Feb 24, 2008 12:50 pm

Perhaps a developer can comment on this. To me this seems like it would be a one day programming job at most for a single person.

Step 1: write an optional routine in the indexer that converts 'special characters' to their simplified version. This is a simple lookup table that would contain perhaps 30 characters to consider.

Step 2: add a single checkbox in the options panel to turn off or on the indexing of diacritics. If it's off, use the lookup table, if it's on, don't use the lookup table.
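
As a rough illustration of the two steps (this is not X1's code; the class name `TinyIndex`, the `index_diacritics` option, and the table contents are all assumptions made for this sketch), an indexer could apply a small lookup table at both indexing and query time whenever the option is off:

```python
# Step 1: a lookup table from 'special characters' to their simplified form.
# Only a handful of entries are shown; a real table would be longer.
GROUPS = {"a": "áàâä", "e": "éèêë", "i": "íîï", "o": "óôö", "u": "úùûü", "c": "ç", "n": "ñ"}
SIMPLIFY = str.maketrans(
    {accented: base for base, variants in GROUPS.items() for accented in variants}
)

class TinyIndex:
    """A toy inverted index: normalized term -> set of document names."""

    def __init__(self, index_diacritics: bool = False):
        # Step 2: the single on/off option from the options panel.
        self.index_diacritics = index_diacritics
        self.postings: dict[str, set[str]] = {}

    def _normalize(self, term: str) -> str:
        term = term.lower()
        if not self.index_diacritics:
            term = term.translate(SIMPLIFY)
        return term

    def add(self, name: str, text: str) -> None:
        for term in text.split():
            self.postings.setdefault(self._normalize(term), set()).add(name)

    def search(self, term: str) -> set[str]:
        # The query goes through the same normalization as the indexed text.
        return self.postings.get(self._normalize(term), set())

index = TinyIndex(index_diacritics=False)
index.add("a.txt", "Habitación doble")
index.add("b.txt", "habitacion individual")
print(sorted(index.search("habitación")))
```

The sketch assumes precomposed characters in the input; text stored in decomposed form would need normalizing first, which is one of the details a one-day estimate might overlook.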

Am I underestimating the complexities here? If not, PLEASE add it to the next beta. -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby gt13 » Sun Feb 24, 2008 2:03 pm

@ pgk :
I fully agree. Ideally, the user could even modify this lookup table, and customize it to his/her needs.

@ w0qj :
And if there is a petition on this topic, I will subscribe!

Gerard
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby Kenward » Sun Feb 24, 2008 2:56 pm

While coding this task might be easy, I wouldn't know, that is not the only thing that matters. If enabling this slows down X1 to such a state that it is unusable by everyone who does not need this feature, then I vote against it.

My guess is that were it a "no brainer" it would have happened long ago.
MK
X1 Search 8.5.2 - Build 6001si (64-bit)
Windows 10 Pro 64-bit | Windows 10 Home 32-bit
No, I have nothing to do with X1, just a user since 2004.
Kenward
X1 Guru
 
Posts: 4059
Joined: Tue Apr 20, 2004 2:35 am
Location: UK

Postby pgk » Sun Feb 24, 2008 6:11 pm

First off, if it could slow down indexing, a simple warning in the options panel should do. I don't think that's the case though. Either way, I'd like someone from X1 to comment on this. If it's hard, I'd really appreciate some hints about the challenges. -pgk
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby Tod » Mon Feb 25, 2008 10:54 am

OK, first, X1 is looking into this and trying to figure out how best to re-activate this feature (since X1 used to ignore diacritics)

Second, it would not slow down indexing, but it would require a COMPLETE reindex when switching from indexing with diacritics to without diacritics and vice-versa. While some users might be OK with this, some other users could get very upset by it.
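
Why a complete reindex would be needed can be illustrated with a small Python sketch. It assumes, hypothetically, that the index stores terms already normalized under the setting that was active when it was built; the `build_index` and `open_index` names and the stored-setting check are inventions for this example, not a description of X1.

```python
import unicodedata

def normalize(term: str, ignore_accents: bool) -> str:
    """Normalize a term the way the (hypothetical) indexer would store it."""
    term = unicodedata.normalize("NFC", term).casefold()
    if ignore_accents:
        decomposed = unicodedata.normalize("NFD", term)
        term = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return term

def build_index(docs: dict[str, str], ignore_accents: bool) -> dict:
    """Build an index and record the setting it was built with."""
    postings: dict[str, set[str]] = {}
    for name, text in docs.items():
        for term in text.split():
            postings.setdefault(normalize(term, ignore_accents), set()).add(name)
    return {"ignore_accents": ignore_accents, "postings": postings}

def open_index(index: dict, docs: dict[str, str], ignore_accents: bool) -> dict:
    """Reuse the index only if its setting matches; otherwise rebuild from scratch."""
    if index["ignore_accents"] != ignore_accents:
        # Folded terms cannot be "unfolded": the accents are no longer in the
        # index, so the documents themselves have to be read again.
        return build_index(docs, ignore_accents)
    return index

docs = {"a.txt": "eczéma", "b.txt": "eczema"}
folded = build_index(docs, ignore_accents=True)
print(sorted(folded["postings"]))   # terms stored with accents ignored
strict = open_index(folded, docs, ignore_accents=False)
print(sorted(strict["postings"]))   # terms stored after the rebuild
```

Going the other way has the same cost in this model, because an accent-sensitive index has no folded keys to look up.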

Third, the overall consideration is not the time it takes to code the feature, but the customer support questions - how do we expose the setting, what kind of warning boxes do you have to go through, and what possible nasty bugs might emerge.

We will definitely let you know if we make the change and will ask you to act as the first beta group.
Tod
X1 Rep
 
Posts: 178
Joined: Thu Apr 07, 2005 8:26 pm
Location: Pasadena, CA

Postby pgk » Mon Feb 25, 2008 12:21 pm

Great, thanks for the response. I recommend looking how others resolved this. Check in Vista - you need to dig deep down in advanced settings, and there is a nice warning message. Once that's in place, it's up to the user whether or not to take the plunge.

If you want to 'one-up' the competition, you could allow 'diacritics users' to toggle 'diacritics based searches', since once they HAVE been indexed, ignoring them in an actual search should be relatively easy. But perhaps that is for beta two. -pgk
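
A sketch of that 'beta two' idea in Python: if the index keeps the exact accented terms, a second map from the accent-free form to the exact spellings lets each query choose whether to honour diacritics. Everything here (the class `DualIndex` and its method names) is hypothetical and only meant to show that the toggle is feasible once accents have been indexed.

```python
import unicodedata

def fold(term: str) -> str:
    """Accent-free, case-free form of a term."""
    decomposed = unicodedata.normalize("NFD", term.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

class DualIndex:
    """Indexes exact terms, plus a folded key pointing at the exact spellings."""

    def __init__(self) -> None:
        self.exact: dict[str, set[str]] = {}     # exact term -> documents
        self.variants: dict[str, set[str]] = {}  # folded term -> exact terms

    def add(self, name: str, text: str) -> None:
        for raw in text.split():
            term = unicodedata.normalize("NFC", raw).casefold()
            self.exact.setdefault(term, set()).add(name)
            self.variants.setdefault(fold(term), set()).add(term)

    def search(self, term: str, ignore_accents: bool) -> set[str]:
        term = unicodedata.normalize("NFC", term).casefold()
        if not ignore_accents:
            return set(self.exact.get(term, set()))
        # Collect the documents of every spelling that shares the folded key.
        hits: set[str] = set()
        for spelling in self.variants.get(fold(term), set()):
            hits |= self.exact[spelling]
        return hits

index = DualIndex()
index.add("a.txt", "écosse")
index.add("b.txt", "ecosse")
print(sorted(index.search("ecosse", ignore_accents=True)))
print(sorted(index.search("écosse", ignore_accents=False)))
```

The cost is a somewhat larger index, but no reindex is needed when a user flips the toggle.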

PS. If you generate an X1 build for Vista 64bit, I'd be happy to test drive it
pgk
X1 Super User
 
Posts: 757
Joined: Thu Dec 18, 2003 8:13 am

Postby Kenward » Mon Feb 25, 2008 2:23 pm

Great. Now that we know that you are on the case, you may well be flooded out with volunteers to do some testing for you.

If I can make a, probably insulting and almost certainly unwelcome, observation, this sort of failure to recognise that the rest of the world does things differently, with its funny accents and stuff, hobbles too many IT businesses in the USA.

Microsoft went global years ago, which is why it is quicker to offer such features.

If X1 can offer a decent "diacritics" option, it could open up large and growing markets. Get ahead of the pack and "they will come".
MK
Kenward
X1 Guru
 
Posts: 4059
Joined: Tue Apr 20, 2004 2:35 am
Location: UK

Postby cwiekol » Thu Mar 06, 2008 7:48 am

Thank God someone from X1 has responded to this issue. Now I will check this forum more often to look out for this beta. I'm tired of using Microsoft desktop search (what a slow, limited, stupid tool)!!
regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby cwiekol » Mon Mar 31, 2008 7:20 am

I've just bought an X1 licence because I'm sick of using WDS and Copernic DS. I hope I will soon see a new X1 beta release with a "diacritics indexing option"!!
regards
Michal
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Is Tod an X1 employee?

Postby FORTUETA » Tue Apr 08, 2008 4:10 am

Hello friends:

I haven't looked at this forum since the first days.

Is Tod an X1 employee? Is X1 really fixing this bug?

(I'm not a native English speaker and it is difficult for me to understand all the users' posts.)

If X1 develops a beta version with this feature, I can test it too.
I have been waiting for it for years, and many Spanish businesses that I have shown X1 to declined to buy it because of this problem.
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Tod is an employee (now I know)

Postby FORTUETA » Tue Apr 08, 2008 4:19 am

Tod is an employee (now I know).

Can anybody on the X1 team tell us about the development of the accented characters fix?

TY
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Any news on accented characters?

Postby FORTUETA » Sun Jul 12, 2009 7:27 am

Hello friends,
Any news on accented characters?
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Postby tjh » Sun Jul 12, 2009 8:07 am

v6.5 and above support double-byte characters. I assume (but don't know) that this means accented characters are supported. Give it a test.
TiM
X1 Search v8.5.1 - 6001se (64 Bit)
tjh
X1 Super User
 
Posts: 398
Joined: Sat Apr 12, 2008 4:12 am
Location: Napier, New Zealand

Postby gt13 » Sun Jul 12, 2009 9:00 am

@ tjh,
I don't think so.
And I am fed up with testing new releases that do not improve anything from this point of view. Each test means several hours of wasted time, and very often it is even difficult to get the old version working properly again after the test!

But since you have it already installed, you can just test it easily,

either using this complete procedure:

gt13 wrote: I just propose something to test your desktop search software.
The result is here (Excel file): http://snipurl.com/7fugo
And you can get instructions and the files in order to do the same here (zip file): http://snipurl.com/7fumt


or a simplified one: just download the ZIP file http://snipurl.com/7fumt , unpack, index, and search for the two words "unusualword eczé", and then for "unusualword ecze".

And tell us...

Thanks.
Gerard
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby tjh » Sun Jul 12, 2009 9:35 am

Searching for "unusualword eczé" (with quotes around) returns Readme34.doc and Readme34.pdf

There's no need to even unpack, X1 searches in the zip file.

It seems to pass the tests that pack offers.

This is using Blackbird Beta II
TiM
tjh
X1 Super User
 
Posts: 398
Joined: Sat Apr 12, 2008 4:12 am
Location: Napier, New Zealand

Postby gt13 » Sun Jul 12, 2009 10:02 am

Please unpack the zip, and don't use the quotes.
You may get a few more results.

Note that people in this topic would like to get all 38 files for each query:
unusualword eczé
unusualword ecze

That is why we say that these versions are unusable for people using accented letters!

Gerard
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby tjh » Sun Jul 12, 2009 10:18 am

Ok, removing the quotes I get 30 results. i.e. it seems to find all instances of it.

I don't get results from file 04_indexation_test_DOS.txt, nor from the 14 XLS file, which is the one that tests comments.

But overall, it seems to work well. It returns more results than Windows Desktop Search 4 is returning. I'll install Copernic Professional later and give that a test as well if you like.

I can also make a VM available to you using VNC if you want to do some tests yourself.

Cheers,
Tim
TiM
tjh
X1 Super User
 
Posts: 398
Joined: Sat Apr 12, 2008 4:12 am
Location: Napier, New Zealand

Postby gt13 » Sun Jul 12, 2009 10:49 am

Thanks for the test.

It seems that there are indeed some improvements since the last time I tested X1.
If you also get the IPTC data embedded in 17_indexation_test_IPTC.jpg, it becomes all the more interesting!

Ok also for a VNC run. I tried to reach you on MSN, but your MSN Id seems to be outdated.

Gerard
Last edited by gt13 on Sun Jul 12, 2009 12:19 pm, edited 1 time in total.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby tjh » Sun Jul 12, 2009 11:10 am

Sorry, nothing for test 17!

I'm running a few various things in my VM, let me get it cleaned up and accessible remotely and I'll PM you some details.

Also, I've just turned on my MSN, tim[at]muppetz dot com is correct.
TiM
tjh
X1 Super User
 
Posts: 398
Joined: Sat Apr 12, 2008 4:12 am
Location: Napier, New Zealand

Postby gt13 » Sun Jul 12, 2009 12:26 pm

After some testing, there is no miracle.
Indeed, the latest version of X1 finds files when one searches for the exact accented word, but searching for eczema will not find eczéma.

X1 is still unusable for people using accented languages.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Postby Kenward » Sun Jul 12, 2009 2:18 pm

What happens when you search for:

"eczema OR eczéma"

Seeking

"écosse OR ecosse"

works just fine here.

So, as I understand it, your beef is really that X1 cannot, of its own free will, interpret ȇ, ȅ, ȩ, è, é and/or ë as e.

I dive in here late, again, and I am not going to faff around running complicated tests. I just throw in this observation so that people who come across this discussion do not get the impression that x1 cannot handle accented characters. It can. But perhaps not in the way that everyone would like.

It does seem like a good idea to treat all the flavours of a character as a "basic" letter. But then what happens when someone wants to find écosse but not ecosse?
MK
Kenward
X1 Guru
 
Posts: 4059
Joined: Tue Apr 20, 2004 2:35 am
Location: UK

Postby gt13 » Sun Jul 12, 2009 3:24 pm

We (non-English-speaking computer users) have been explaining the problem since 2005. You can go back to the first posts of this topic to understand the problem.

There is no point in trying to explain your way of thinking to us. This topic could be renamed (I know, it is too long):
"As long as X1 is not able to correctly handle the problem of accented letters, explained many times, X1 cannot be considered a SEARCH engine for non-English-speaking people, and consequently it cannot penetrate the business market".

For this reason, we do not understand why X1 does not make any effort to solve the problem.
Microsoft did the job (you can choose in Windows Search whether to distinguish between "e" and "é"). Windows Search is also able to search IPTC data embedded in pictures, which is also a great feature. For end users like me, it still lacks (for some time?) some important features, like Thunderbird email indexing (should be solved with the next Thunderbird 3 release).

Gerard
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

At last, an accent option (v 6.5)

Postby FORTUETA » Sun Jul 12, 2009 11:24 pm

Hello, I have installed a new beta.

There is an option "ignore accents".

I ran a test (2 different *.txt files, one with the word "habitación", the other with "habitacion").

When I search for habitación I get 2 results.
When I search for habitacion I get 2 results too.

It seems to work.

I'll run more tests.
FORTUETA
 
Posts: 27
Joined: Wed Sep 26, 2007 2:03 am

Postby Kenward » Mon Jul 13, 2009 1:12 am

You will find this feature in:

>>Options
>>Indexing

There is also an option:

"Insert word breaks around Asian language characters"

Here is the relevant section in the Help file:

From the indexing options on the right, under Indexing Options, click to select character options:

Ignore accents in characters: If you select this option, X1 will treat letters with accents the same as the letter without an accent. For example, é and e will both be considered to be the accent-less letter e. If you leave this option unchecked, X1 will differentiate between a letter with and without an accent during searches.

Insert word breaks around Asian language characters: If you select this option, X1 will insert spaces around Asian-character words. If you are performing an exact-term search, X1 will be able to locate a word for you. If you do not select this option, X1 may not be able to tell where a word with Asian characters ends for exact-term searches.

If X1 has already created an index of your items, you will see a message alerting you that a new index will need to be created. Click OK to allow X1 to clear the old index and create a new one incorporating your character settings.

I doubt if this will meet all the needs of those who want to conduct really subtle searches, but it might help those with lesser demands.
MK
Kenward
X1 Guru
 
Posts: 4059
Joined: Tue Apr 20, 2004 2:35 am
Location: UK

Postby gt13 » Tue Jul 14, 2009 12:19 am

Thanks guys. Great news concerning accented letters !

I also noticed that the indexing of removable drives came back to X1, starting from X1® Professional Client Blackbird Beta II (Build 3840): http://forums.x1.com/viewtopic.php?t=4057

I will try the new version when I have some spare time.

Gerard
I need a desktop search engine that indexes removable drives, network drives, Thunderbird, image's metadata, Evernote, and works fine with accented letters.
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

another disappointment

Postby cwiekol » Mon Sep 14, 2009 3:56 am

I've just installed 6.6.5 (Build 3904) in the hope that I could come back to X1.
And guess what.. it still doesn't recognize Polish diacritics (ąśżźćęółń)..


you should be ashamed of yourself!
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am

Postby gt13 » Mon Sep 14, 2009 4:53 am

OK. Thanks for the feedback !
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Re: Accented Characters

Postby gt13 » Sun Aug 29, 2010 10:00 am

I have just run a test with the latest version, 6.7.1.
Of course, I used the setting (before building the index): Tools > Options > Indexing > "Ignore accents in characters" enabled

Nothing new with Polish diacritics: if there is POZWOLIŁEM in the file,
searching for the exact word POZWOLIŁEM succeeds,
but searching for the word POZWOLILEM (with L instead of Ł) fails.

For the detailed results, I have updated my test procedure with some new tests (OpenOffice and PowerPoint formats):
The result is here (Excel file): http://snipurl.com/7fugo
And you can get instructions and the files in order to do the same here (zip file): http://snipurl.com/7fumt

Gerard
gt13
X1 Power User
 
Posts: 64
Joined: Sat Apr 17, 2004 10:09 am
Location: Marseille, France

Re: Accented Characters

Postby cwiekol » Fri Nov 19, 2010 7:01 am

hi Guys,

Do we have any improvement on this issue?

I'm dreaming about this!!!!!
cwiekol
 
Posts: 24
Joined: Mon Jan 28, 2008 9:12 am
