Strings concatenation optimisation

Strings concatenation optimisation

Postby Ivan Denisov » Fri Oct 25, 2019 5:55 am

Please take a look for the optimization feature suggested by Oleg.
https://forum.oberoncore.ru/viewtopic.p ... 11#p108850
User avatar
Ivan Denisov
 
Posts: 277
Joined: Tue Sep 17, 2013 12:21 am
Location: Krasnoyarsk, Russia

Re: Strings concatenation optimisation

Postby Josef Templ » Sat Oct 26, 2019 6:52 am

I cannot see the problem with statements like s := s + "xxx" in BB.
Can you give an example test case that fails in BB?

- Josef
Josef Templ
 
Posts: 242
Joined: Tue Sep 17, 2013 6:50 am

Re: Strings concatenation optimisation

Postby luowy » Sat Oct 26, 2019 6:21 pm

the output code is correct, Oleg complain the qulity of the output code:
the
Code: Select all
 a :=a +'str';
output
Code: Select all
a :=a; a+="str";
instead of
Code: Select all
 a.append('str');

copy(write) itself is not necessary and inefficient, get(read) it's length is better;

but I think the code generater is good enough,look the main code of a:=a+'str';
Code: Select all
LEA   EDI, [EBP-200]   
MOV   ECX, 100   
;===========================
LEA   ESI, [EBP-200]   
LODSW      
OR   AX, AX   
JZ   7 (66830099H)   
STOSW                  ; add di,2   
DEC   ECX   
JNZ   -12 (6683008BH)   
TRAP   8   ; copyTrap
;============================
MOV   ESI, 66830068H   
LODSW      
STOSW   

the optmization what I can do is:
Code: Select all
 add di,2;  STOSW        

Is it worth it? need change the OPV OPC OPL three file;
the string plus only suitable for short string, large one will cause stack overflow;
Code: Select all
   PROCEDURE Do* ();
      VAR s: POINTER TO ARRAY OF CHAR; 
   BEGIN
      NEW(s, 1024 * 1024 * 2);
      s^ := s^ + 'a';   // ok
      s^ := 'a' + s^;  (*stack ovreflow!! *)
   END Do;
the code use stack buffer when needed;

the optmization is valuable for large string operation, untill now we only use it for short ones;
luowy
 
Posts: 51
Joined: Thu Dec 17, 2015 1:32 pm

Re: Strings concatenation optimisation

Postby Josef Templ » Sun Oct 27, 2019 8:19 am

> the output code is correct, Oleg complain the qulity of the output code:

Thanks for the clarification.

In general, when optimizing statements like
s := s + ... ;
one must be careful about detecting the correct pattern.
The following statement may fail if there is no buffer allocated for the resulting string.
s := s + "xxx" + s;

I hope that Oleg checked for this.

- Josef
Josef Templ
 
Posts: 242
Joined: Tue Sep 17, 2013 6:50 am

Re: Strings concatenation optimisation

Postby Zorko » Tue Oct 29, 2019 5:11 pm

Now the operation of the form

Code: Select all
s := s$ + "something"


implemented as follows: the string s is copied into itself in a loop, which also searches for the terminating 0X and also checks the string (buffer) length, for generating trap when out of the string. Basic copying is done using instructions lodsw:stosw (for CHAR) and lodsb:stosb (for SHORTCHAR).

It is obvious that the optimization of this solution may not be strong, so I would have just suggest to remove copying. Let only stays search for 0X with a checking for out-of-string case. How exactly to implement it better - I have not researched. The i386 processor doesn't have a stand-alone command for searching 0X in a memory, as far as I know. Therefore, the search cycle is still inevitable.
Zorko
 
Posts: 21
Joined: Tue Sep 17, 2013 10:13 pm
Location: Ukraine

Re: Strings concatenation optimisation

Postby Zorko » Tue Oct 29, 2019 5:15 pm

What about
Code: Select all
s := s + "xxx" + s;

, such operation is now decomposed into three operations:

Code: Select all
s := s; s += "xxx"; s += s;

And the buffer overflow check happens in every operation here. And the search for the final 0X occurs here even twice.
Zorko
 
Posts: 21
Joined: Tue Sep 17, 2013 10:13 pm
Location: Ukraine

Re: Strings concatenation optimisation

Postby luowy » Thu Oct 31, 2019 6:35 am

Rethink Oleg's question, the strings operation need some improvement;

1, the form: a:=a+'str'; (no buffer needed)
old : a :=a; a += 'str';
new: a := (scan 0X); a+="str";
improve point: read a instead copy a;

2,the form: a := b + a; (need buffer)
old: a' =b; a'+= a; a:=a';
new: a':=a; a:=b; a+=a';
improve point: use a' as src buffer, short the copy length;

3,do we need restrict the buffer length ?
threshold =4096;
if LEN(buf)>threshold AllocHeap() or error(I preffer)
else AllocStack()
end;

luowy
luowy
 
Posts: 51
Joined: Thu Dec 17, 2015 1:32 pm

Re: Strings concatenation optimisation

Postby Josef Templ » Fri Nov 01, 2019 9:01 am

Zorko wrote:What about
Code: Select all
s := s + "xxx" + s;

, such operation is now decomposed into three operations:

Code: Select all
s := s; s += "xxx"; s += s;

And the buffer overflow check happens in every operation here. And the search for the final 0X occurs here even twice.


If s = "" at the beginning, then s := s + "xxx" + s; should set s to "xxx", right?
With the proposed decomposed form you would get "xxxxxx", which is wrong.

- Josef
Josef Templ
 
Posts: 242
Joined: Tue Sep 17, 2013 6:50 am

Re: Strings concatenation optimisation

Postby Zorko » Mon Nov 04, 2019 3:57 am

luowy is right. I like that he is open to new ideas.

Josef has a point too.

Please, answer the question: do we have a technical possibility at the time of constructing the concatenation operation of the form s := s + "a" + s to find out whether the string s is present in the chain of added strings? If the operation does not contain the source string to concatenation, then we do not need a buffer at all. We use this string as a buffer, thus saving and optimizing the concatenation.

Code: Select all
s := s + "a" + b + c; (* s can be used as buffer *)
s := "a" + b + c + s; (* s can not be used as buffer *)

So our task is to find: can a string be the buffer or a separate buffer is required. I'm going to implement this strategy in Ofront+

As far as I understand, now BlackBox always uses the buffer.
Zorko
 
Posts: 21
Joined: Tue Sep 17, 2013 10:13 pm
Location: Ukraine

Re: Strings concatenation optimisation

Postby luowy » Mon Nov 04, 2019 10:54 am

Zorko wrote:As far as I understand, now BlackBox always uses the buffer.
no! the current compiler has already done this optimization, check my previous post (form 1); or you can check it by decoder(e.g. open code file)
luowy
 
Posts: 51
Joined: Thu Dec 17, 2015 1:32 pm

Next

Return to Feature

Who is online

Users browsing this forum: No registered users and 1 guest

cron