Hershel 64-bit compiler

Component Pascal native x64 compiler http://herschel.oberon.org/
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

As for the first proxy table entry being unused:

PROCEDURE HrE.Init;
...
		proxyAvail := M.pointerSize; 	(* no proxy shall have address 0, since it cannot be written into Uses *)
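
For illustration only, here is a minimal Python sketch of the reservation described in the comment above. It assumes 8-byte pointers, and it assumes that offset 0 is unusable because 0 terminates the link list in Uses. `ProxyTable`, `alloc` and the symbol names are hypothetical, not HrE identifiers.

```python
POINTER_SIZE = 8  # assumption: amd64 pointer size, standing in for M.pointerSize


class ProxyTable:
    """Hypothetical allocator of proxy slots, addressed by byte offset."""

    def __init__(self, pointer_size=POINTER_SIZE):
        self.pointer_size = pointer_size
        # Skip the first slot so that no proxy ever gets offset 0.
        self.avail = pointer_size
        self.slots = {}  # imported symbol name -> byte offset of its proxy

    def alloc(self, name):
        """Return the proxy offset for `name`, allocating a slot on first use."""
        if name not in self.slots:
            self.slots[name] = self.avail
            self.avail += self.pointer_size
        return self.slots[name]


table = ProxyTable()
first = table.alloc("Kernel.NewRec")   # hypothetical imported symbol
second = table.alloc("Kernel.NewArr")  # hypothetical imported symbol
```

The cost of this scheme is one wasted pointer-sized slot per module.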
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

Also, it seems I have not included in the repo the module needed to view a diagram in HrE; the diagram explains the module's memory map. Will do.

Edit: Have done, added ChartViews.
X512
Posts: 72
Joined: Sat Feb 07, 2015 2:51 pm

Re: Hershel 64-bit compiler

Post by X512 »

First, the OCF format is great - I like it much better than ELF; I have had to study both in the last 5 weeks.
Second, while we can preserve the overall OCF file structure, some things in it will still have to change. The reason is that OCF holds the image of the metadata, which holds pointers, and they grow from 4 to 8 bytes.
adimetrius wrote: So my idea was that the processor code (a new one for amd64) could indicate that OCF holds 64-bit data. The new code would stop old tools from looking into the file and misinterpreting it.
It is better to set the high bit of the CPU field, or something similar, to indicate a 64-bit module. 64-bit modules can also be used by aarch64, PowerPC, MIPS etc. Various module tools (loaders, linkers, decoders) can then work without knowledge of the instruction set.
adimetrius wrote: Now, to the UseBlk. Currently, Herschel generates PIC - it needs no relocations when its base is changed (when it's loaded at varying base addresses). It also does not need any binding fixups (to bind/link to imported symbols).

Addresses of imported CP procedures and external host library functions are collected in one place - the proxy table. References to the proxy table are hard-coded into the generated code.
Code fixups (fixups with type "relative") may still be needed for static linking, to merge the same OCF segments of multiple modules into one ELF segment. Proxy tables may be merged into one ELF GOT/GOT-PLT table with different protection. Hard-coded code offsets into the proxy table may require a lot of ELF segments or mixing code with data.
adimetrius wrote: All this to say, the UseBlk's link list holds only one value.

This is how a UseBlk procedure entry was defined in OCF:
UProc = 4X name fprint link.
link = {fixupadr offset} 0X.

Now, in OCF/64, the fixupadr is the address in the proxy table, and the offset is zero. (Also, fingerprinting is not yet implemented, so the fingerprints are written out as zeroes.)
This violates the current OCF fixup format. A fixup link address has a specific format. If it is positive, it is an offset into the code segment (0 < link < codeSize). If it is negative, its negation is an offset into the meta segment (0 < -link < metaSize) or, past that, into the desc segment (metaSize <= -link < metaSize + descSize). A proxy segment can be added to this fixup link format, for example as (metaSize + descSize <= -link < metaSize + descSize + proxySize). The zero proxy record will then not be needed.
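
The ranges above can be sketched as a small classifier. This is only an illustration in Python, not loader code: `classify_link` and its parameter names are made up here, and the proxy range is the one proposed in this post, not part of the existing OCF format.

```python
def classify_link(link, code_size, meta_size, desc_size, proxy_size):
    """Map a fixup link value to (segment name, offset within that segment).

    Positive links point into the code segment; negated negative links run
    through meta, then desc, then the proposed proxy range.
    """
    if link > 0:
        if link < code_size:
            return ("code", link)
    elif link < 0:
        off = -link
        if off < meta_size:
            return ("meta", off)
        off -= meta_size
        if off < desc_size:
            return ("desc", off)
        off -= desc_size
        if off < proxy_size:
            return ("proxy", off)
    # link = 0 is the list terminator and never names a location
    raise ValueError(f"link {link} is outside every segment")


# Hypothetical segment sizes, chosen only for the example
sizes = dict(code_size=100, meta_size=32, desc_size=64, proxy_size=24)
first_proxy = classify_link(-96, **sizes)
```

With these sizes the first proxy slot (offset 0 in the proxy segment) is reached through link = -96, a nonzero value, so no slot has to be sacrificed as long as metaSize + descSize > 0.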
adimetrius wrote: Does this answer your question?
Thanks for explaining.
Last edited by X512 on Wed Dec 02, 2020 3:38 pm, edited 4 times in total.
X512
Posts: 72
Joined: Sat Feb 07, 2015 2:51 pm

Re: Hershel 64-bit compiler

Post by X512 »

adimetrius wrote:Edit: Have done, added ChartViews.
The last commit is still from 7 days ago. "ChartViews" is not found in the repository. Did you do "git push"?
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

Check out the dev branch. After several requests, Ivan and I have decided that the master branch is to hold the most recent working version, while dev can be used for intermediary commits, including those that break working functionality or may not even be fully compilable. I've just checked:

Image

My bad - should've mentioned dev branch.
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

X512 wrote: It is better to set the high bit of the CPU field, or something similar, to indicate a 64-bit module. 64-bit modules can also be used by aarch64, PowerPC, MIPS etc. Various module tools (loaders, linkers, decoders) can then work without knowledge of the instruction set.
Well, at this moment, it essentially means a new processor code; but instead of 11 it would be 8000000BH. I don't have enough (any, actually) knowledge of the listed architectures, and I don't know if they have enough in common to be able to use the same (or a similar) code file format. But I'm more than willing to use 8000000BH as the code for amd64 instead of 11.
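
To make the arithmetic concrete, here is a small Python sketch of the high-bit idea. It assumes a 32-bit processor field; the helper names are hypothetical and only the values 11 and 8000000BH come from this thread.

```python
FLAG_64BIT = 0x80000000  # high bit of an assumed 32-bit processor field
CPU_AMD64 = 11           # the code discussed in this thread


def make_cpu_field(cpu, is_64bit):
    """Combine a processor code with the proposed 64-bit flag."""
    return (cpu | FLAG_64BIT) if is_64bit else cpu


def is_64bit_module(field):
    """A tool can test the flag without knowing the instruction set."""
    return bool(field & FLAG_64BIT)


def base_cpu(field):
    """Strip the flag to recover the plain processor code."""
    return field & ~FLAG_64BIT & 0xFFFFFFFF


def as_signed32(field):
    """Value as seen through a signed 32-bit integer."""
    return field - (1 << 32) if field & FLAG_64BIT else field


field = make_cpu_field(CPU_AMD64, True)
```

Here `field` equals 0x8000000B. One thing to keep in mind: read as a signed 32-bit integer, every flagged value is negative, so old tools comparing the field against known positive codes would reject the file, while new tools can simply mask the bit.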

Any objections or considerations from others in our community?
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

X512 wrote: Code fixups (fixups with type "relative") may still be needed for static linking, to merge the same OCF segments of multiple modules into one ELF segment... Hard-coded code offsets into the proxy table may require a lot of ELF segments or mixing code with data.
You're exactly right; here are my considerations regarding this issue.

First and foremost, Herschel is a CP compiler for BlackBox. So, OCF files are its primary concern. The compiler is fully capable of producing PIC that requires no base relocations and no binding fixups. This speeds up OCF loading and makes the OCF file smaller. So my inclination is to generate PIC into the OCF file, for use within BB.

However, ELF executables and libraries are also needed, which is the concern of a static linker (binder). If the binder is fed regular OCF files, then yes, it cannot perform static linking of OCF modules, and the ELF file will hold a lot of 'air'. My most frequent use of DevLinker is to generate the BlackBox executable, which usually comprises 3-5 modules. Given that its size is ~100KB, I'm not overly concerned.
However, I imagine there may be use cases/applications where it would be critical to produce the most compact ELF/PE. For such cases, I envision that the Herschel backend will provide additional fixup information at the request of the binder. Probably something like:

PROCEDURE HrCompiler.PrepToBind* (mod: TextModels.Model; OUT ocf: Files.File; OUT fixups: Fixups);

This way, the most efficient and compact OCF/ELF/PE can be generated.
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

X512 wrote:Proxy tables may be merged into one ELF GOT/GOT-PLT table with different protection.
I really really don't want to mess with GOT-PLT, here's why: inside BB, we can do everything without these facilities. Looks to me like we could do without them even in ELFs. Why mess with them then? Keep it as simple as possible, eh? So unless GOT-PLT provide a critically important advantage (which I'm not aware of), I'd stay away from them and keep the binder simpler.

But, I would fully agree that if OCF modules are statically bound (as I laid out in a previous post), all their proxy tables may and should be merged into one.
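
As a rough illustration of what such a merge involves, here is a Python sketch. It is not binder code: `merge_proxy_tables` is a hypothetical helper, slots are assumed to be 8 bytes, offsets are counted from the first used slot of each table, and the symbol names are invented.

```python
def merge_proxy_tables(tables, pointer_size=8):
    """Merge per-module proxy tables into one, sharing duplicate symbols.

    tables: {module name: [imported symbol, ...]} in slot order.
    Returns (merged symbol list, {module: {old offset: new offset}}).
    """
    merged = []  # symbols in merged slot order
    index = {}   # symbol -> slot index in the merged table
    remap = {}   # per-module translation of proxy offsets
    for module, symbols in tables.items():
        remap[module] = {}
        for slot, symbol in enumerate(symbols):
            if symbol not in index:
                index[symbol] = len(merged)
                merged.append(symbol)
            remap[module][slot * pointer_size] = index[symbol] * pointer_size
    return merged, remap


tables = {
    "ModA": ["Kernel.NewRec", "libc.write"],
    "ModB": ["libc.write", "Kernel.NewArr"],
}
merged, remap = merge_proxy_tables(tables)
```

The `remap` part is the point: because references into the proxy table are hard-coded in the generated code, a merged table only works if the binder can find and patch those references, which is the additional fixup information mentioned in my previous post.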
adimetrius
Posts: 68
Joined: Sun Aug 04, 2019 1:02 pm

Re: Hershel 64-bit compiler

Post by adimetrius »

X512 wrote: A fixup link address has a specific format. If it is positive, it is an offset into the code segment (0 < link < codeSize). If it is negative, its negation is an offset into the meta segment (0 < -link < metaSize) or, past that, into the desc segment (metaSize <= -link < metaSize + descSize). A proxy segment can be added to this fixup link format, for example as (metaSize + descSize <= -link < metaSize + descSize + proxySize). The zero proxy record will then not be needed.
I like this idea very much! Don't see any problems with it! Will incorporate!
X512
Posts: 72
Joined: Sat Feb 07, 2015 2:51 pm

Re: Hershel 64-bit compiler

Post by X512 »

adimetrius wrote: For such cases, I envision Herschel backend will provide additional fixup information at the request of the binder. Probably something like:
The code segment of 64-bit modules does not contain relocation chains and can be executed directly if the proxy segment is not moved. So we can change the meaning of the link address for the code range (0 < link < codeSize). Fixups are CPU-dependent, so it is fine to have different fixup types for each CPU type. Instead of reading the fixup type from the code, we can assume the "relative" fixup type and store at the code offset only the offset to the proxy segment, without a fixup type. Then we can add a proxy fixup to the existing 6 fixups (NewRecLink NewArrLink MetaLink DescLink CodeLink VarLink) in the OCF header. It can be ignored if the proxy segment is not moved.

Something like this:

PROCEDURE Fixup (adr: INTEGER; mod: ModSpec);
	VAR link, offset, linkadr, t, n, x, low, hi: INTEGER;
BEGIN
	RNum(link);
	WHILE link # 0 DO
		RNum(offset);
		WHILE link # 0 DO
			IF link > 0 THEN
				linkadr := mod.mad + mod.ms + link; (* code *)
				t := relative;	(* proposed: code links are always "relative", no type byte is read *)
				n := 0	(* proposed: no relocation chain in the code segment *)
			ELSE link := -link;
				IF link < mod.ms THEN linkadr := mod.mad + link (* meta *)
				ELSE linkadr := mod.dad + link - mod.ms (* desc *)
				END;
				S.GET(linkadr, x); t := x DIV 1000000H; (* x = t'8 n'24, t - unsigned, n - signed *)
				n := (x + 800000H) MOD 1000000H - 800000H
			END;
			IF t = absolute THEN x := adr + offset
			ELSIF t = relative THEN x := adr + offset - linkadr - 4
			ELSIF t = copy THEN S.GET(adr + offset, x)
			ELSIF t = table THEN x := adr + n; n := link + 4
			ELSIF t = tableend THEN x := adr + n; n := 0
			ELSIF t = deref THEN S.GET(adr+2, x); INC(x, offset);
			ELSIF t = halfword THEN
				x := adr + offset;
				low := (x + 8000H) MOD 10000H - 8000H;
				hi := (x - low) DIV 10000H;
				S.GET(linkadr + 4, x);
				S.PUT(linkadr + 4, x DIV 10000H * 10000H + low MOD 10000H);
				x := x * 10000H + hi MOD 10000H
			ELSE Error(syntaxError, mod, NIL)
			END;
			S.PUT(linkadr, x); link := n
		END;
		RNum(link)
	END
END Fixup;
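
The bit arithmetic in the procedure above can be checked in isolation. Below is a Python sketch of two pieces: splitting a fixup word into the 8-bit type `t` and the signed 24-bit next-link `n`, and the value written for a "relative" fixup. The function names are hypothetical and the type bytes used in the example are arbitrary, not the real OCF constants.

```python
def decode_fixup_word(x):
    """Split x = t'8 n'24 into (t, n): t unsigned 8 bits, n signed 24 bits."""
    x &= 0xFFFFFFFF                              # view the word as unsigned 32-bit
    t = x >> 24                                  # top byte: fixup type
    n = (x + 0x800000) % 0x1000000 - 0x800000    # sign-extend the low 24 bits
    return t, n


def relative_fixup(target, linkadr):
    """32-bit displacement measured from the end of the 4-byte field."""
    return target - linkadr - 4


# Arbitrary type bytes (0x64, 0x65), chosen only to exercise the decoder
chain_step = decode_fixup_word(0x64000010)
negative_step = decode_fixup_word(0x65FFFFFC)
displacement = relative_fixup(0x2000, 0x1000)
```

In the proposed variant the decoder is skipped for code-segment links, so only `relative_fixup` applies there; meta and desc links still go through the `t`/`n` decoding.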