Language Character Sets
#11
What Rolf provided was perfect.

Code:
epixoip@ike:~$ printf "$(echo B9A8C9D6D3CAC5CDC3D8D9C7D5DADDC6C4CBCED0CFC0C2DBD4DFD7D1CCC8D2DCC1DEB8E9F6F3EAE5EDE3F8F9E7F5FAFDE6E4EBEEF0EFE0E2FBF4FFF7F1ECE8F2FCE1FE | sed -r 's/[a-fA-F0-9]{2}/\\x&/g')" >rolf.txt

epixoip@ike:~$ cat rolf.txt
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒޸▒▒▒▒▒​▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

epixoip@ike:~$ iconv -f windows-1251 -t utf8 rolf.txt
№ЁЙЦУКЕНГШЩЗХЪЭЖДЛОРПАВЫФЯЧСМИТЬБЮёйцукенгшщзхъэждлорпавыфячсмитьбю

beautifully done, Rolf.
#12
Hah, thanks.

Any comments regarding the "№" character?
Should it be included in the charset file or not?
Technically it's not part of the alphabet, but then the Norwegian (Norsk) charset file may contain the euro sign in the same fashion (I think they have a euro sign on their keyboards).
#13
If it's on the keyboard it should be in the charset file, absolutely.
#14
(02-07-2013, 12:36 PM)epixoip Wrote: If it's on the keyboard it should be in the charset file, absolutely.

^ This. :)

I perhaps should have mentioned that in my original post.

I have a suspicion that the French one will be done today :)

The Spanish one is ready but the Trac is still down.

Is there no one else who would like to help? Does anyone else have a request for a different character set?
#15
If I have understood another thread correctly, we need both the "regular" and the "DOS" charset for each language, so that LM hashes can also be cracked:

http://hashcat.net/forum/thread-2016.html

For example, Russian LM hashes work with the DOS Cyrillic encoding (CP866). I started testing German as well, and for LM it seems to use Western European (DOS-850). As a request: the standard German charset.
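
To make the difference between the "regular" and the "DOS" charset concrete, here is a minimal sketch that encodes the same three letters in both Cyrillic code pages and prints the resulting bytes as hex. It assumes `iconv` and `od` are installed; the helper name `to_hex` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: encode the UTF-8 text in $2 into code page $1
# and print the resulting bytes as one upper-case hex string.
to_hex() {
    printf '%s' "$2" | iconv -f UTF-8 -t "$1" | od -An -v -tx1 | tr -d ' \n' | tr 'a-f' 'A-F'
    echo
}

to_hex WINDOWS-1251 'АБВ'   # Windows Cyrillic, prints C0C1C2
to_hex CP866 'АБВ'          # DOS Cyrillic, prints 808182
```

The same letters end up as different bytes, which is why one charset file per encoding is needed rather than one per language.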
#16
ah yeah, we need the dos charsets as well
#17
Looks like the Trac is working again.

Here is the ticket number where you can upload your files.

Thanks very much for your help. :)
#18
So does that mean oclHashcat will support words like "Coordinación", with the "ó"?
#19
(02-08-2013, 05:56 PM)eljolot Wrote: So does that mean oclHashcat will support words like "Coordinación", with the "ó"?
Exactly.
#20
The trac is not working, so I'm going to make a post here. I have been asked to provide a comprehensive list of characters unique to the Spanish language. I have compiled a list which also incorporates the Catalan language. Now I will explain the differences between these languages to enable users to choose the best character set for a given task.

In Spain there are at least 5 languages spoken. The most common is Spanish, with more than 450M speakers worldwide. Catalan (or Valencian) is the next most popular, with an estimated 12M speakers. Those figures alone give an idea of the relative importance of each one.

"¿¡" In Spanish these inverted characters are positioned at the beginning of questions or interjections. The inclusion of these characters is omitted by most people when writing informally as the meaning usually remains the same. I am unsure about their inclusion in Spanish passwords as I have no data to assess. Personally I do use "¡" in my passwords as "¡" is a single key press, while "¿" is a combination with the shift key. I believe the combination keypress may deter people from including this character in their passwords.

Ñ is a letter that is only used in Spanish and Ç is used in Catalan/Valencian, but also in Portuguese and French.

About accent marks: please read these (it is too long to post here):
http://en.wikipedia.org/wiki/Spanish_ort...ed_letters
http://en.wikipedia.org/wiki/Diaeresis_%...%29#Hiatus

Probably the least commonly used characters are those with a diaeresis: üÜ in Spanish and ïÏ in Catalan. Again, I am unsure whether these are frequently included in passwords, as it is faster to write words without the accent mark. It is still possible that people use them.

The remaining characters are:
€ - euro symbol (Alt Gr + E)
¨ - diaeresis (shift combination, position may vary)
ª - feminine ordinal indicator (shift + º)
´ - acute accent mark (shift combination, position may vary)
· - middle dot (shift + 3) [http://en.wikipedia.org/wiki/Interpunct]
º - masculine ordinal indicator (on the left of 1)

Here you can see a picture of a typical Spanish keyboard.

These are the charsets I would propose:

Spanish:
€¡¨ª´·º¿ÁÉÍÑÓÚÜáéíñóúü

Catalan:
€¡¨ª´·º¿ÀÇÈÉÍÏÒÓÚÜàçèéíïòóúü

Full:
€¡¨ª´·º¿ÀÁÇÈÉÍÏÑÒÓÚÜàáçèéíïñòóúü
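
As a rough sketch of how one of these lists could be turned into a raw single-byte file (like rolf.txt earlier in the thread), the snippet below writes the Spanish list in Windows-1252 and counts the bytes. The helper name `write_charset` and the temporary output file are assumptions for illustration only, not an agreed naming scheme.

```shell
#!/bin/sh
# Hypothetical helper: write the UTF-8 characters in $1 to file $3
# as raw bytes in the encoding named by $2.
write_charset() {
    printf '%s' "$1" | iconv -f UTF-8 -t "$2" > "$3"
}

spanish='€¡¨ª´·º¿ÁÉÍÑÓÚÜáéíñóúü'
out=$(mktemp)                 # temporary file; a real charset file would get a proper name
write_charset "$spanish" WINDOWS-1252 "$out"
wc -c < "$out"                # a single-byte code page should give one byte per character
rm -f "$out"
```

If the byte count does not match the number of characters, the target encoding could not represent something in the list.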

Not every character is available in all of ISO-8859-1, ISO-8859-15 and Windows code page 1252. For example, € is missing from ISO-8859-1, while ¨ and ´ are missing from ISO-8859-15.
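
A quick way to check encoding coverage is to let `iconv` try each character on its own. This is only a sketch: it assumes an `iconv` that exits with an error on unrepresentable characters (GNU iconv does so when no `//TRANSLIT` or `//IGNORE` suffix is given), and the helper name `missing_in` is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print those of the given UTF-8 characters ($2...)
# that cannot be encoded in the target encoding $1.
missing_in() {
    enc=$1; shift
    missing=''
    for ch in "$@"; do
        # iconv is assumed to fail when the character has no mapping in $enc.
        if ! printf '%s' "$ch" | iconv -f UTF-8 -t "$enc" >/dev/null 2>&1; then
            missing="$missing$ch"
        fi
    done
    printf '%s\n' "$missing"
}

chars='€ ¡ ¨ ª ´ · º ¿ À Á Ç È É Í Ï Ñ Ò Ó Ú Ü à á ç è é í ï ñ ò ó ú ü'
for enc in ISO-8859-1 ISO-8859-15 WINDOWS-1252; do
    # $chars is left unquoted on purpose so that it splits into one argument per character.
    printf '%s: %s\n' "$enc" "$(missing_in "$enc" $chars)"
done
```

Going by the published code page tables, this should report € for ISO-8859-1, ¨ and ´ for ISO-8859-15, and nothing for Windows-1252.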

With all this information available, what should be included in the charset/s?