Utf8 converter is missing in HostTextConv.
Posted: Mon Apr 22, 2019 5:51 pm
by Zinn
In the past I worked with Linux on my notebook and with Windows 10 on my desktop computer. Now I have also changed my desktop to Linux Mint Cinnamon 19.1 Tessa.
Now I have realised that Windows log and text files still use the old code page format, whereas Linux uses the modern UTF-8 format.
In HostTextConv we have converters for Text, RichText and Unicode, but I miss a converter for Utf8.
Shouldn't we add the procedures ImportUtf8 & ExportUtf8 from CpcUtf8Conv to HostTextConv?
What is your opinion?
- Helmut
Re: Utf8 converter is missing in HostTextConv.
Posted: Thu Apr 25, 2019 5:26 pm
by Ivan Denisov
Good initiative.
Re: Utf8 converter is missing in HostTextConv.
Posted: Fri Apr 26, 2019 5:36 am
by Ivan Denisov
Zinn wrote: Now I have also changed my desktop to Linux Mint Cinnamon 19.1 Tessa.
I also installed it yesterday.
After a BIOS update under Windows destroyed the order of my UEFI boot entries, from which Debian usually started, I decided to try Mint, because it uses my favorite Cinnamon desktop (I used it in Debian) and at the same time is based on Ubuntu, with good support for CUDA and Nvidia drivers (needed for molecular modeling).
I have checked that BlackBox for Linux works well if you use this set of commands:
Code: Select all
wget http://dev.obertone.ru/dev.obertone.ru.gpg.key
sudo apt-key add dev.obertone.ru.gpg.key
sudo dpkg --add-architecture i386
echo "deb [arch=i386] http://dev.obertone.ru/bionic testing main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install bbcb
Re: Utf8 converter is missing in HostTextConv.
Posted: Fri Apr 26, 2019 9:23 pm
by Robert
Zinn wrote: Shouldn't we add the procedures ImportUtf8 & ExportUtf8 from CpcUtf8Conv to HostTextConv?
What is your opinion?
Seems like a reasonable idea to me.
Are the existing converters documented anywhere - should they be?
Re: Utf8 converter is missing in HostTextConv.
Posted: Sat Apr 27, 2019 10:02 am
by Josef Templ
Adding UTF-8 converters seems like a good idea to me.
What would be the file name extension? Is there any standard for that?
- Josef
Re: Utf8 converter is missing in HostTextConv.
Posted: Sat Apr 27, 2019 12:04 pm
by Zinn
Josef Templ wrote:What would be the file name extension? Is there any standard for that?
On Windows I don't know of any standard extension.
On Linux I often use it with the extension .sh (bash script).
Originally the importer/exporter was used by Romiras with the extension .cp.
Furthermore, on Windows a line ends with CR and LF,
whereas on Linux it ends with LF only.
- Helmut
Re: Utf8 converter is missing in HostTextConv.
Posted: Tue Apr 30, 2019 7:21 am
by Zinn
I prefer to select all files, because all txt and log files on Linux use UTF-8.
- Helmut
Re: Utf8 converter is missing in HostTextConv.
Posted: Wed May 08, 2019 7:14 pm
by Josef Templ
.utf8 seems to be the 'official' extension for UTF-8 text files.
In addition, there could be the Converters.importAll flag set for such a converter.
I don't know exactly what it changes, though.
For decoding arbitrary binary files, however, this may lead to utf-8 decoding errors and the
question is how to handle them, e.g. output a question mark or similar.
When reading, CR-LF and LF can both be treated equally.
When writing, it may be a good idea to adapt to the platform (Win or Wine)
- Josef
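One possible way to handle the decoding errors raised above is to validate each sequence and substitute a question mark for anything malformed. The following is only a sketch under assumptions: the module name Utf8Sketch, the procedures IsCont and Decode, and the buffer-based interface are hypothetical and are not taken from CpcUtf8Conv or HostTextConv. It covers sequences of up to three bytes, since CHAR is 16 bits wide, and it does not reject overlong three-byte encodings or surrogate code points.

```componentpascal
MODULE Utf8Sketch;
	(* Hypothetical module, for illustration only. *)

	CONST replacement = "?";

	(* TRUE if buf[i] exists and is a continuation byte 10xx xxxx.
		BYTE is signed, so MOD 256 maps it to 0..255 first. *)
	PROCEDURE IsCont (IN buf: ARRAY OF BYTE; len, i: INTEGER): BOOLEAN;
	BEGIN
		RETURN (i < len) & (buf[i] MOD 256 DIV 64 = 2)
	END IsCont;

	(* Decodes the character starting at buf[pos] and advances pos.
		A malformed or truncated sequence yields replacement and consumes one byte. *)
	PROCEDURE Decode* (IN buf: ARRAY OF BYTE; len: INTEGER; VAR pos: INTEGER; OUT ch: CHAR);
		VAR b1: INTEGER;
	BEGIN
		ASSERT((0 <= pos) & (pos < len) & (len <= LEN(buf)), 20);
		b1 := buf[pos] MOD 256; INC(pos);
		IF b1 < 80H THEN	(* 0xxx xxxx: ASCII *)
			ch := CHR(b1)
		ELSIF (b1 >= 0C2H) & (b1 < 0E0H) & IsCont(buf, len, pos) THEN	(* 110x xxxx *)
			ch := CHR(64 * (b1 MOD 32) + buf[pos] MOD 64);
			INC(pos)
		ELSIF (b1 >= 0E0H) & (b1 < 0F0H)
			& IsCont(buf, len, pos) & IsCont(buf, len, pos + 1) THEN	(* 1110 xxxx *)
			ch := CHR(4096 * (b1 MOD 16) + 64 * (buf[pos] MOD 64) + buf[pos + 1] MOD 64);
			INC(pos, 2)
		ELSE	(* stray continuation byte, four-byte lead byte, or truncated input *)
			ch := replacement
		END
	END Decode;

END Utf8Sketch.
```

With this approach a lone invalid byte comes out as one question mark, while a four-byte sequence comes out as four, because every byte of it is rejected separately.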
Re: Utf8 converter is missing in HostTextConv.
Posted: Sat May 11, 2019 5:55 pm
by Zinn
Hello Josef,
The module at
http://www.zinnamturm.eu/downloadsAC.htm#CpcUtf8Conv works as you describe when reading CR-LF and LF.
Writing is adapted to the Wine platform, but it does not check for any errors. The part about SetFont should be deleted.
Of course the extension can be changed to .utf8.
Helmut
Re: Utf8 converter is missing in HostTextConv.
Posted: Mon Jul 01, 2019 6:28 am
by Zinn
Here is my change proposal for this topic.
Host/Mod/TextConv.odc
(1)
Code: Select all
PROCEDURE ReadChar (VAR rd: Stores.Reader; VAR ch: CHAR);
	VAR
		c1, c2, c3: BYTE;
BEGIN (* UTF-8 format; BYTE is signed, so bytes 80H..0FFH appear as -128..-1 *)
	rd.ReadByte(c1);
	ch := CHR(c1);
	IF c1 < 0 THEN (* high bit set: two-byte sequence 110x xxxx (0C0H..0DFH = -64..-33) or longer *)
		rd.ReadByte(c2);
		ch := CHR(64 * (c1 MOD 32) + (c2 MOD 64));
		IF c1 >= - 32 THEN (* three-byte sequence 1110 xxxx (0E0H..0EFH = -32..-17) *)
			rd.ReadByte(c3);
			ch := CHR(4096 * (c1 MOD 16) + 64 * (c2 MOD 64) + (c3 MOD 64));
		END;
	END;
END ReadChar;
PROCEDURE ImportUtf8* (f: Files.File; OUT s: Stores.Store);
	VAR r: Stores.Reader; t: TextModels.Model; wr: TextModels.Writer; ch, nch: CHAR;
BEGIN
	ASSERT(f # NIL, 20);
	r.ConnectTo(f); r.SetPos(0);
	t := TextModels.dir.New(); wr := t.NewWriter(NIL);
	ReadChar(r, ch);
	WHILE ~r.rider.eof DO
		ReadChar(r, nch); (* one character of lookahead, to detect CR LF pairs *)
		IF (ch = CR) & (nch = LF) THEN ReadChar(r, nch) (* skip the LF, keep the CR *)
		ELSIF ch = LF THEN ch := CR (* a lone LF also becomes CR *)
		END;
		wr.WriteChar(ch);
		ch := nch;
	END;
	s := TextViews.dir.New(t)
END ImportUtf8;
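The arithmetic in ReadChar relies on BYTE being signed and on MOD returning a non-negative result for a positive divisor, so `c1 MOD 32` strips the prefix bits even though c1 is negative. The small check below works through two characters by hand. The module name Utf8Check and the chosen test characters are only for illustration and are not part of the proposal.

```componentpascal
MODULE Utf8Check;
	(* Hypothetical test module; execute Utf8Check.Do, a failed ASSERT traps. *)

	PROCEDURE Do*;
		VAR c1, c2, c3: BYTE;
	BEGIN
		(* a-umlaut, U+00E4, is C3 A4 in UTF-8, i.e. -61, -92 as signed bytes *)
		c1 := -61; c2 := -92;
		ASSERT(c1 MOD 32 = 3, 100);	(* payload bits of 110x xxxx *)
		ASSERT(c2 MOD 64 = 36, 101);	(* payload bits of 10xx xxxx *)
		ASSERT(64 * (c1 MOD 32) + (c2 MOD 64) = 0E4H, 102);

		(* euro sign, U+20AC, is E2 82 AC in UTF-8, i.e. -30, -126, -84 *)
		c1 := -30; c2 := -126; c3 := -84;
		ASSERT(4096 * (c1 MOD 16) + 64 * (c2 MOD 64) + (c3 MOD 64) = 20ACH, 103)
	END Do;

END Utf8Check.
```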
Host/Mod/TextConv.odc
(2)
Code: Select all
PROCEDURE WriteChar (VAR wr: Stores.Writer; ch: CHAR);
BEGIN (* UTF-8 format; the negative offsets are the signed BYTE values of the prefix bits *)
	IF ch <= 7FX THEN (* one byte: 0xxx xxxx *)
		wr.WriteByte(SHORT(SHORT(ORD(ch))))
	ELSIF ch <= 7FFX THEN (* two bytes: 110x xxxx (-64 = 0C0H), 10xx xxxx (-128 = 80H) *)
		wr.WriteByte(SHORT(SHORT( - 64 + ORD(ch) DIV 64)));
		wr.WriteByte(SHORT(SHORT( - 128 + ORD(ch) MOD 64)))
	ELSE (* three bytes: 1110 xxxx (-32 = 0E0H), 10xx xxxx, 10xx xxxx *)
		wr.WriteByte(SHORT(SHORT( - 32 + ORD(ch) DIV 4096)));
		wr.WriteByte(SHORT(SHORT( - 128 + ORD(ch) DIV 64 MOD 64)));
		wr.WriteByte(SHORT(SHORT( - 128 + ORD(ch) MOD 64)))
	END
END WriteChar;
PROCEDURE ExportUtf8* (s: Stores.Store; f: Files.File);
	VAR w: Stores.Writer; t: TextModels.Model; r: TextModels.Reader; ch: CHAR;
BEGIN
	ASSERT(s # NIL, 20); ASSERT(f # NIL, 21);
	s := TextView(s);
	IF s # NIL THEN
		w.ConnectTo(f); w.SetPos(0);
		t := s(TextViews.View).ThisModel();
		IF t # NIL THEN
			r := t.NewReader(NIL);
			r.ReadChar(ch);
			WHILE ~r.eot DO
				(* the text model stores line breaks as CR; write them as LF *)
				IF ch = CR THEN WriteChar(w, LF) ELSE WriteChar(w, ch) END;
				r.ReadChar(ch)
			END
		END
	END
END ExportUtf8;
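It was suggested earlier in the thread that writing could adapt the line ending to the platform, whereas ExportUtf8 above always writes LF. A minimal sketch of one way to make that switchable follows. The module name, the WriteEol procedure and the crlf parameter are assumptions for illustration, and how crlf would be chosen (a preference, a platform test, a separate converter) is left open here.

```componentpascal
MODULE Utf8EolSketch;
	(* Hypothetical helper, not part of HostTextConv. *)
	IMPORT Stores;

	(* Writes one line ending: CR LF if crlf is set, otherwise LF only.
		Both characters are ASCII, so each is a single byte in UTF-8. *)
	PROCEDURE WriteEol* (VAR wr: Stores.Writer; crlf: BOOLEAN);
	BEGIN
		IF crlf THEN wr.WriteByte(0DH) END;
		wr.WriteByte(0AH)
	END WriteEol;

END Utf8EolSketch.
```

In ExportUtf8 the branch `IF ch = CR THEN WriteChar(w, LF)` would then become `IF ch = CR THEN WriteEol(w, crlf)`, with the ELSE branch unchanged.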
System/Mod/Config.odc
Code: Select all
Converters.Register("HostTextConv.ImportUtf8", "HostTextConv.ExportUtf8", "TextViews.View", "utf8", {Converters.importAll});
Converters.Register("HostTextConv.ImportText", "HostTextConv.ExportText", "TextViews.View", "txt", {});
Host/Rsrc/Strings.odc
Code: Select all
HostTextConv.ImportUtf8 Linux Text
HostTextConv.ImportText Window Text
...
HostTextConv.ExportUtf8 Linux Plain Text
HostTextConv.ExportText Window Plain Text
These changes work under Wine on my Linux system.
- Helmut