Strings concatenation optimisation
- Ivan Denisov
 - Posts: 366
 - Joined: Tue Sep 17, 2013 12:21 am
 - Location: Krasnoyarsk, Russia
 
Strings concatenation optimisation
Please take a look for the optimization feature suggested by Oleg. 
https://forum.oberoncore.ru/viewtopic.p ... 11#p108850
			
			
									
						
										
						https://forum.oberoncore.ru/viewtopic.p ... 11#p108850
- 
				Josef Templ
 - Posts: 263
 - Joined: Tue Sep 17, 2013 6:50 am
 
Re: Strings concatenation optimisation
I cannot see the problem with statements like s := s + "xxx" in BB.
Can you give an example test case that fails in BB?
- Josef
			
			
									
						
										
						Can you give an example test case that fails in BB?
- Josef
Re: Strings concatenation optimisation
the output code is correct, Oleg  complain the qulity of the output code:
the output  instead of  
copy(write) itself is not necessary and inefficient, get(read) it's length is better;
but I think the code generater is good enough,look the main code of a:=a+'str';	
the optmization what I can do is:
Is it worth it? need change the OPV OPC OPL three file;
the string plus only suitable for short string, large one will cause stack overflow; the code use stack buffer when needed;
the optmization is valuable for large string operation, untill now we only use it for short ones;
			
			
									
						
										
						the
Code: Select all
 a :=a +'str';Code: Select all
a :=a; a+="str";Code: Select all
 a.append('str');but I think the code generater is good enough,look the main code of a:=a+'str';
Code: Select all
LEA	EDI, [EBP-200]	
MOV	ECX, 100	
;===========================
LEA	ESI, [EBP-200]	
LODSW		
OR	AX, AX	
JZ	7 (66830099H)	
STOSW	               ; add di,2	
DEC	ECX	
JNZ	-12 (6683008BH)	
TRAP	8	; copyTrap
;============================
MOV	ESI, 66830068H	
LODSW		
STOSW	the optmization what I can do is:
Code: Select all
 add di,2;  STOSW	  	the string plus only suitable for short string, large one will cause stack overflow;
Code: Select all
	PROCEDURE Do* ();
		VAR s: POINTER TO ARRAY OF CHAR;  
	BEGIN
		NEW(s, 1024 * 1024 * 2);
		s^ := s^ + 'a';   // ok 
		s^ := 'a' + s^;  (*stack ovreflow!! *)
	END Do;
the optmization is valuable for large string operation, untill now we only use it for short ones;
- 
				Josef Templ
 - Posts: 263
 - Joined: Tue Sep 17, 2013 6:50 am
 
Re: Strings concatenation optimisation
> the output code is correct, Oleg complain the qulity of the output code:
Thanks for the clarification.
In general, when optimizing statements like
s := s + ... ;
one must be careful about detecting the correct pattern.
The following statement may fail if there is no buffer allocated for the resulting string.
s := s + "xxx" + s;
I hope that Oleg checked for this.
- Josef
			
			
									
						
										
						Thanks for the clarification.
In general, when optimizing statements like
s := s + ... ;
one must be careful about detecting the correct pattern.
The following statement may fail if there is no buffer allocated for the resulting string.
s := s + "xxx" + s;
I hope that Oleg checked for this.
- Josef
Re: Strings concatenation optimisation
Now the operation of the form
implemented as follows: the string s is copied into itself in a loop, which also searches for the terminating 0X and also checks the string (buffer) length, for generating trap when out of the string. Basic copying is done using instructions lodsw:stosw (for CHAR) and lodsb:stosb (for SHORTCHAR).
It is obvious that the optimization of this solution may not be strong, so I would have just suggest to remove copying. Let only stays search for 0X with a checking for out-of-string case. How exactly to implement it better - I have not researched. The i386 processor doesn't have a stand-alone command for searching 0X in a memory, as far as I know. Therefore, the search cycle is still inevitable.
			
			
									
						
										
						Code: Select all
s := s$ + "something"It is obvious that the optimization of this solution may not be strong, so I would have just suggest to remove copying. Let only stays search for 0X with a checking for out-of-string case. How exactly to implement it better - I have not researched. The i386 processor doesn't have a stand-alone command for searching 0X in a memory, as far as I know. Therefore, the search cycle is still inevitable.
Re: Strings concatenation optimisation
What about 
, such operation is now decomposed into three operations:
And the buffer overflow check happens in every operation here. And the search for the final 0X occurs here even twice.
			
			
									
						
										
						Code: Select all
s := s + "xxx" + s;Code: Select all
s := s; s += "xxx"; s += s;Re: Strings concatenation optimisation
Rethink Oleg's question, the strings  operation need some improvement;
1, the form: a:=a+'str'; (no buffer needed)
old : a :=a; a += 'str';
new: a := (scan 0X); a+="str";
improve point: read a instead copy a;
2,the form: a := b + a; (need buffer)
old: a' =b; a'+= a; a:=a';
new: a':=a; a:=b; a+=a';
improve point: use a' as src buffer, short the copy length;
3,do we need restrict the buffer length ?
threshold =4096;
if LEN(buf)>threshold AllocHeap() or error(I preffer)
else AllocStack()
end;
luowy
			
			
									
						
										
						1, the form: a:=a+'str'; (no buffer needed)
old : a :=a; a += 'str';
new: a := (scan 0X); a+="str";
improve point: read a instead copy a;
2,the form: a := b + a; (need buffer)
old: a' =b; a'+= a; a:=a';
new: a':=a; a:=b; a+=a';
improve point: use a' as src buffer, short the copy length;
3,do we need restrict the buffer length ?
threshold =4096;
if LEN(buf)>threshold AllocHeap() or error(I preffer)
else AllocStack()
end;
luowy
- 
				Josef Templ
 - Posts: 263
 - Joined: Tue Sep 17, 2013 6:50 am
 
Re: Strings concatenation optimisation
If s = "" at the beginning, then s := s + "xxx" + s; should set s to "xxx", right?Zorko wrote:What about, such operation is now decomposed into three operations:Code: Select all
s := s + "xxx" + s;
And the buffer overflow check happens in every operation here. And the search for the final 0X occurs here even twice.Code: Select all
s := s; s += "xxx"; s += s;
With the proposed decomposed form you would get "xxxxxx", which is wrong.
- Josef
Re: Strings concatenation optimisation
luowy is right. I like that he is open to new ideas.
Josef has a point too.
Please, answer the question: do we have a technical possibility at the time of constructing the concatenation operation of the form s := s + "a" + s to find out whether the string s is present in the chain of added strings? If the operation does not contain the source string to concatenation, then we do not need a buffer at all. We use this string as a buffer, thus saving and optimizing the concatenation.
So our task is to find: can a string be the buffer or a separate buffer is required. I'm going to implement this strategy in Ofront+
As far as I understand, now BlackBox always uses the buffer.
			
			
									
						
										
						Josef has a point too.
Please, answer the question: do we have a technical possibility at the time of constructing the concatenation operation of the form s := s + "a" + s to find out whether the string s is present in the chain of added strings? If the operation does not contain the source string to concatenation, then we do not need a buffer at all. We use this string as a buffer, thus saving and optimizing the concatenation.
Code: Select all
s := s + "a" + b + c; (* s can be used as buffer *)
s := "a" + b + c + s; (* s can not be used as buffer *)As far as I understand, now BlackBox always uses the buffer.
Re: Strings concatenation optimisation
no! the current compiler has already done this optimization, check my previous post (form 1); or you can check it by decoder(e.g. open code file)Zorko wrote: As far as I understand, now BlackBox always uses the buffer.