Utf8 converter is missing in HostTextConv.

Post by Zinn » Mon Apr 22, 2019 5:51 pm

In the past I worked with Linux on my notebook and with Windows 10 on my desktop computer. Now I have also changed my desktop to Linux Mint Cinnamon 19.1 Tessa.
That made me realise that Windows log and text files still use the old code page format, while Linux uses the modern UTF-8 format.

HostTextConv has converters for Text, RichText and Unicode, but I miss a converter for UTF-8.

Shouldn't we add the procedures ImportUtf8 & ExportUtf8 from CpcUtf8Conv to the HostTextConv?
What is your opinion?

- Helmut

Post by Ivan Denisov » Thu Apr 25, 2019 5:26 pm

Good initiative.

Post by Ivan Denisov » Fri Apr 26, 2019 5:36 am

Zinn wrote: Now I have also changed my desktop to Linux Mint Cinnamon 19.1 Tessa.

I also installed it yesterday :)

After a BIOS update under Windows destroyed the order of my UEFI boot entries, from which Debian usually started, I decided to try Mint, because it uses my favorite Cinnamon desktop (I used it in Debian) and at the same time it is based on Ubuntu, with good support for CUDA and Nvidia drivers (needed for molecular modeling).

I have checked that BlackBox for Linux works well if you use this set of commands:
Code: Select all
wget http://dev.obertone.ru/dev.obertone.ru.gpg.key
sudo apt-key add dev.obertone.ru.gpg.key
sudo dpkg --add-architecture i386
echo "deb [arch=i386] http://dev.obertone.ru/bionic testing main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install bbcb

Post by Robert » Fri Apr 26, 2019 9:23 pm

Zinn wrote: Shouldn't we add the procedures ImportUtf8 & ExportUtf8 from CpcUtf8Conv to the HostTextConv?
What is your opinion?

Seems like a reasonable idea to me.

Are the existing converters documented anywhere - should they be?

Post by Josef Templ » Sat Apr 27, 2019 10:02 am

Adding UTF-8 converters seems like a good idea to me.
What would be the file name extension? Is there any standard for that?

- Josef

Post by Zinn » Sat Apr 27, 2019 12:04 pm

Josef Templ wrote: What would be the file name extension? Is there any standard for that?

On Windows I don't know of any standard extension.
On Linux I often use it with the extension .sh (bash scripts).
Originally the importer/exporter was used by Romiras with the extension .cp.

Furthermore, on Windows a line ends with CR and LF,
while on Linux it ends with LF only.

- Helmut

Post by Zinn » Tue Apr 30, 2019 7:21 am

I prefer to select all files, because all txt and log files on Linux use UTF-8.
- Helmut

Post by Josef Templ » Wed May 08, 2019 7:14 pm

.utf8 seems to be the 'official' extension for UTF-8 text files.
In addition, the Converters.importAll flag could be set for such a converter.
I don't know exactly what it changes, though.
For decoding arbitrary binary files, however, this may lead to utf-8 decoding errors and the
question is how to handle them, e.g. output a question mark or similar.
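
To make the error-handling question concrete, here is a minimal sketch of a lenient decoder that emits "?" for anything it cannot decode. It is only an illustration under stated assumptions: the module name ObxUtf8Lenient and the procedure Decode are hypothetical, it works on an in-memory byte array rather than a Stores.Reader, and it only covers 1- to 3-byte sequences because CHAR is 16 bits wide.

```component-pascal
MODULE ObxUtf8Lenient;   (* hypothetical module name, for illustration only *)

   CONST replacement = "?";

   (* Decodes len bytes of UTF-8 from src into dst (0X-terminated).
      Each byte that cannot start or continue a valid 1- to 3-byte sequence yields one "?". *)
   PROCEDURE Decode* (IN src: ARRAY OF SHORTCHAR; len: INTEGER; OUT dst: ARRAY OF CHAR);
      VAR i, j, b, n, k, code: INTEGER; ok: BOOLEAN;
   BEGIN
      i := 0; j := 0;
      WHILE (i < len) & (j < LEN(dst) - 1) DO
         b := ORD(src[i]); INC(i);
         IF b < 80H THEN code := b; n := 0   (* plain ASCII *)
         ELSIF (b >= 0C2H) & (b < 0E0H) THEN code := b MOD 32; n := 1   (* 110x xxxx *)
         ELSIF (b >= 0E0H) & (b < 0F0H) THEN code := b MOD 16; n := 2   (* 1110 xxxx *)
         ELSE code := -1; n := 0   (* stray continuation byte, overlong lead or 4-byte lead *)
         END;
         ok := code >= 0; k := 0;
         WHILE ok & (k < n) DO
            IF (i < len) & (ORD(src[i]) DIV 64 = 2) THEN   (* continuation byte 10xx xxxx *)
               code := 64 * code + ORD(src[i]) MOD 64; INC(i); INC(k)
            ELSE ok := FALSE   (* sequence cut short; the offending byte is decoded on its own *)
            END
         END;
         (* reject overlong 3-byte forms and surrogate code points *)
         IF ok & (n = 2) & ((code < 800H) OR (code >= 0D800H) & (code <= 0DFFFH)) THEN
            ok := FALSE
         END;
         IF ok THEN dst[j] := CHR(code) ELSE dst[j] := replacement END;
         INC(j)
      END;
      dst[j] := 0X
   END Decode;

   PROCEDURE Do*;
      VAR src: ARRAY 8 OF SHORTCHAR; dst: ARRAY 8 OF CHAR;
   BEGIN
      src[0] := "a"; src[1] := 0C3X; src[2] := 0A4X;   (* "a", then C3 A4 = U+00E4 *)
      src[3] := 0FFX;   (* FF never occurs in UTF-8 *)
      src[4] := 0C3X; src[5] := "b";   (* lead byte without its continuation *)
      Decode(src, 6, dst);
      ASSERT((dst[0] = "a") & (dst[1] = 0E4X) & (dst[2] = "?")
         & (dst[3] = "?") & (dst[4] = "b") & (dst[5] = 0X), 100)
   END Do;

END ObxUtf8Lenient.
```

Whether a question mark is the best replacement is open; the sketch only shows that the decision can be confined to one place in the decoder.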

When reading, CR-LF and LF can both be treated equally.
When writing, it may be a good idea to adapt to the platform (Win or Wine)
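
As a rough sketch of what this could look like, the following maps CR-LF and LF to a single CR on reading (the code BlackBox texts use for a line break, as in the importer discussed in this thread) and leaves the choice between CR-LF and LF on writing to a flag. The module name ObxLineEnds and both procedure names are hypothetical, and how the flag would be derived from the platform is deliberately left open.

```component-pascal
MODULE ObxLineEnds;   (* hypothetical module name, for illustration only *)

   CONST CR = 0DX; LF = 0AX;

   (* Copies src to dst; CR-LF and a lone LF both become a single CR, a lone CR is kept. *)
   PROCEDURE ToInternal* (IN src: ARRAY OF CHAR; OUT dst: ARRAY OF CHAR);
      VAR i, j: INTEGER; ch: CHAR;
   BEGIN
      i := 0; j := 0;
      WHILE (src[i] # 0X) & (j < LEN(dst) - 1) DO
         ch := src[i]; INC(i);
         IF ch = CR THEN
            IF src[i] = LF THEN INC(i) END   (* swallow the LF of a CR-LF pair *)
         ELSIF ch = LF THEN ch := CR
         END;
         dst[j] := ch; INC(j)
      END;
      dst[j] := 0X
   END ToInternal;

   (* Copies src to dst; every CR becomes CR-LF if crlf is TRUE, otherwise LF only. *)
   PROCEDURE ToExternal* (IN src: ARRAY OF CHAR; crlf: BOOLEAN; OUT dst: ARRAY OF CHAR);
      VAR i, j: INTEGER;
   BEGIN
      i := 0; j := 0;
      WHILE (src[i] # 0X) & (j < LEN(dst) - 2) DO   (* keep room for two characters *)
         IF src[i] = CR THEN
            IF crlf THEN dst[j] := CR; INC(j) END;
            dst[j] := LF
         ELSE dst[j] := src[i]
         END;
         INC(i); INC(j)
      END;
      dst[j] := 0X
   END ToExternal;

   PROCEDURE Do*;
      VAR s, t, u: ARRAY 16 OF CHAR;
   BEGIN
      s[0] := "a"; s[1] := CR; s[2] := LF; s[3] := "b"; s[4] := LF; s[5] := "c"; s[6] := 0X;
      ToInternal(s, t);
      ASSERT((t[0] = "a") & (t[1] = CR) & (t[2] = "b") & (t[3] = CR)
         & (t[4] = "c") & (t[5] = 0X), 100);
      ToExternal(t, FALSE, u);   (* Linux style *)
      ASSERT((LEN(u$) = 5) & (u[1] = LF) & (u[3] = LF), 101);
      ToExternal(t, TRUE, u);   (* Windows style *)
      ASSERT((LEN(u$) = 7) & (u[1] = CR) & (u[2] = LF), 102)
   END Do;

END ObxLineEnds.
```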

- Josef

Post by Zinn » Sat May 11, 2019 5:55 pm

Hello Josef,
the module at http://www.zinnamturm.eu/downloadsAC.htm#CpcUtf8Conv handles CR-LF and LF on reading as you describe.
Writing is adapted to the Wine platform, but it does not check for any errors. The part about SetFont should be deleted.
Of course the extension can be changed to .utf8.
Helmut

Post by Zinn » Mon Jul 01, 2019 6:28 am

Here is my change proposal for this topic:

Host/Mod/TextConv.odc
(1)
Code: Select all
   PROCEDURE ReadChar (VAR rd: Stores.Reader; VAR ch: CHAR);
      (* Reads one UTF-8 sequence of 1 to 3 bytes; no validation, 4-byte sequences are not handled *)
      VAR
         c1, c2, c3: BYTE;
   BEGIN   (* UTF-8 format *)
      rd.ReadByte(c1);
      ch := CHR(c1);
      IF c1 < 0 THEN (* BYTE is signed: c1 < 0 means the high bit is set (80H..FFH), a second byte follows *)
         rd.ReadByte(c2);
         ch := CHR(64 * (c1 MOD 32) + (c2 MOD 64));
         IF c1 >= -32 THEN (* -32 = E0H: lead byte 1110 xxxx, a third byte follows *)
            rd.ReadByte(c3);
            ch := CHR(4096 * (c1 MOD 16) + 64 * (c2 MOD 64) + (c3 MOD 64));
         END;
      END;
   END ReadChar;

   PROCEDURE ImportUtf8* (f: Files.File; OUT s: Stores.Store);
      VAR r: Stores.Reader; t: TextModels.Model; wr: TextModels.Writer; ch, nch: CHAR;
   BEGIN
      ASSERT(f # NIL, 20);
      r.ConnectTo(f); r.SetPos(0);
      t := TextModels.dir.New(); wr := t.NewWriter(NIL);
      ReadChar(r, ch);
      WHILE ~r.rider.eof DO
         ReadChar(r, nch);
         IF (ch = CR) & (nch = LF) THEN ReadChar(r, nch)
         ELSIF ch = LF THEN ch := CR
         END;
         wr.WriteChar(ch);
         ch := nch;
      END;
      s := TextViews.dir.New(t)
   END ImportUtf8;
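
To check the arithmetic in ReadChar by hand, here is a small worked example. It is only a sketch: the module name ObxUtf8Check and the helper procedures are hypothetical, and the helpers take plain integers 0..255 instead of signed BYTE values so that the byte values can be written as hex literals.

```component-pascal
MODULE ObxUtf8Check;   (* hypothetical module name, for illustration only *)

   IMPORT StdLog;

   (* Same arithmetic as the 2-byte branch of ReadChar *)
   PROCEDURE Decode2 (b1, b2: INTEGER): CHAR;
   BEGIN
      RETURN CHR(64 * (b1 MOD 32) + b2 MOD 64)
   END Decode2;

   (* Same arithmetic as the 3-byte branch of ReadChar *)
   PROCEDURE Decode3 (b1, b2, b3: INTEGER): CHAR;
   BEGIN
      RETURN CHR(4096 * (b1 MOD 16) + 64 * (b2 MOD 64) + b3 MOD 64)
   END Decode3;

   PROCEDURE Do*;
   BEGIN
      (* C3 A4: 64 * (195 MOD 32) + 164 MOD 64 = 64 * 3 + 36 = 228 = 0E4H *)
      ASSERT(Decode2(0C3H, 0A4H) = 0E4X, 100);
      (* E2 82 AC: 4096 * 2 + 64 * 2 + 44 = 8364 = 20ACH, the euro sign *)
      ASSERT(Decode3(0E2H, 82H, 0ACH) = 20ACX, 101);
      (* a signed BYTE gives the same remainder, because 256 is a multiple of 32 *)
      ASSERT((-61) MOD 32 = 195 MOD 32, 102);
      StdLog.String("UTF-8 decode checks passed"); StdLog.Ln
   END Do;

END ObxUtf8Check.
```

The last assertion is the reason the MOD expressions in ReadChar work directly on signed BYTE values: MOD always yields a non-negative result for a positive divisor, so C3H read as -61 still contributes the payload bits 3.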

Host/Mod/TextConv.odc
(2)
Code: Select all
   PROCEDURE WriteChar (VAR wr: Stores.Writer; ch: CHAR);
   BEGIN   (* UTF-8 format; BYTE is signed, so -64 = C0H, -32 = E0H and -128 = 80H *)
      IF ch <= 7FX THEN   (* 1 byte: 0xxx xxxx *)
         wr.WriteByte(SHORT(SHORT(ORD(ch))))
      ELSIF ch <= 7FFX THEN   (* 2 bytes: 110x xxxx, 10xx xxxx *)
         wr.WriteByte(SHORT(SHORT(-64 + ORD(ch) DIV 64)));
         wr.WriteByte(SHORT(SHORT(-128 + ORD(ch) MOD 64)))
      ELSE   (* 3 bytes: 1110 xxxx, 10xx xxxx, 10xx xxxx *)
         wr.WriteByte(SHORT(SHORT(-32 + ORD(ch) DIV 4096)));
         wr.WriteByte(SHORT(SHORT(-128 + ORD(ch) DIV 64 MOD 64)));
         wr.WriteByte(SHORT(SHORT(-128 + ORD(ch) MOD 64)))
      END
   END WriteChar;

   PROCEDURE ExportUtf8* (s: Stores.Store; f: Files.File);
      VAR w: Stores.Writer; t: TextModels.Model; r: TextModels.Reader; ch: CHAR;
   BEGIN
      ASSERT(s # NIL, 20); ASSERT(f # NIL, 21);
      s := TextView(s);
      IF s # NIL THEN
         w.ConnectTo(f); w.SetPos(0);
         t := s(TextViews.View).ThisModel();
         IF t # NIL THEN
            r := t.NewReader(NIL);
            r.ReadChar(ch);
            WHILE ~r.eot DO
               IF ch = CR THEN WriteChar(w, LF) ELSE WriteChar(w, ch) END;
               r.ReadChar(ch)
            END
         END
      END
   END ExportUtf8;
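
The encoding side can be checked in the same way. The following sketch repeats the arithmetic of WriteChar with unsigned values (C0H, E0H, 80H instead of -64, -32, -128) and stores the bytes in an integer array; the module name ObxUtf8Encode and the procedure Encode are hypothetical and not part of the proposal.

```component-pascal
MODULE ObxUtf8Encode;   (* hypothetical module name, for illustration only *)

   (* Encodes ch as n UTF-8 bytes in b[0..n-1], each as an integer 0..255 *)
   PROCEDURE Encode (ch: CHAR; OUT b: ARRAY OF INTEGER; OUT n: INTEGER);
      VAR c: INTEGER;
   BEGIN
      c := ORD(ch);
      IF c <= 7FH THEN
         b[0] := c; n := 1
      ELSIF c <= 7FFH THEN
         b[0] := 0C0H + c DIV 64; b[1] := 80H + c MOD 64; n := 2
      ELSE
         b[0] := 0E0H + c DIV 4096; b[1] := 80H + c DIV 64 MOD 64;
         b[2] := 80H + c MOD 64; n := 3
      END
   END Encode;

   PROCEDURE Do*;
      VAR b: ARRAY 3 OF INTEGER; n: INTEGER;
   BEGIN
      Encode("A", b, n);
      ASSERT((n = 1) & (b[0] = 41H), 100);
      Encode(0E4X, b, n);   (* 228 DIV 64 = 3, 228 MOD 64 = 36 *)
      ASSERT((n = 2) & (b[0] = 0C3H) & (b[1] = 0A4H), 101);
      Encode(20ACX, b, n);   (* 8364 DIV 4096 = 2, 8364 DIV 64 MOD 64 = 2, 8364 MOD 64 = 44 *)
      ASSERT((n = 3) & (b[0] = 0E2H) & (b[1] = 82H) & (b[2] = 0ACH), 102);
      (* the signed form used in WriteChar denotes the same byte: C3H - 256 = -61 = -64 + 3 *)
      ASSERT(0C3H - 256 = -64 + 0E4H DIV 64, 103)
   END Do;

END ObxUtf8Encode.
```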

System/Mod/Config.odc
Code: Select all
   Converters.Register("HostTextConv.ImportUtf8", "HostTextConv.ExportUtf8", "TextViews.View", "utf8", {Converters.importAll});
   Converters.Register("HostTextConv.ImportText", "HostTextConv.ExportText", "TextViews.View", "txt", {});

Host/Rsrc/Strings.odc
Code: Select all
HostTextConv.ImportUtf8   Linux Text
HostTextConv.ImportText   Windows Text
...
HostTextConv.ExportUtf8   Linux Plain Text
HostTextConv.ExportText   Windows Plain Text

These changes work under Wine on my Linux system.
- Helmut