GCC for the TI


91 replies to this topic

#76 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,095 posts
  • Location:Germany

Posted Tue Jul 26, 2011 1:24 PM

hhm..... I have 2 questions:

* What registers are used by the compiler generated program? Are all 16 registers used or are some of them "free" for own use?
* What (scratchpad) memory is used by the compiler generated program?

#77 insomnia OFFLINE  

insomnia

    Space Invader

  • 36 posts
  • Location:Pittsburgh, PA

Posted Tue Jul 26, 2011 11:00 PM

OK, the multiply bug was more involved than the earlier ones, but here you go:

In gcc-4.4.0/gcc/config/tms9900/tms9900.md, remove lines 1455 through 1484 (the "mulhisi3" and "*multhisi" patterns).

Replace them with this:

(define_insn "mulhisi3"
  [(set (match_operand:SI 0 "register_operand" "=r,r")
	(mult:SI (match_operand:HI 1 "register_operand" "r,r")
		 (match_operand:HI 2 "general_operand" "rR>,Q")))]
  ""
  {
    /* When both input operands are registers, we may need to swap them. */
    if(REG_P(operands[1]) && REG_P(operands[2]))
    {
      /* Check for forms like: r0 = r1 * r0 */ 
      if(REGNO(operands[0]) == REGNO(operands[2]))
      {
        /* Swap operands, otherwise we will emit code like:
             mov r1, r0
             mpy r0, r0

           instead of:
             mpy r1, r0
        */
        rtx temp = operands[1];
        operands[1] = operands[2];
        operands[2] = temp;
      }
    }

    if(REGNO(operands[0]) != REGNO(operands[1]))
    {
      output_asm_insn("mov  %1, %0", operands);
    }
    output_asm_insn("mpy  %2, %0", operands);
    return("");
  }
  [(set_attr "length" "2,3")])

The original code was an attempt to force the register allocator to use registers which were most convenient for the MPY instruction. Obviously, that didn't work out so well. This new code is less aggressive, accepting any register choice GCC may make. It also works at all optimization levels. An optional MOV instruction is now used to prepare for the multiply if the register allocator is not kind.
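To make the register-only case of the new pattern concrete, here is a small host-side C model of the same decision logic. It is only a sketch: `emit_mul` and its plain integer register numbers are hypothetical stand-ins for the `operands[]` handling in the pattern above, not code from the GCC port.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of the register-only case of the "mulhisi3" pattern.
   Writes the instructions that would be emitted for dst = src1 * src2
   and returns how many instructions were needed. */
static int emit_mul(char *out, size_t n, int dst, int src1, int src2)
{
    assert(out != NULL && n > 0);

    /* For forms like r0 = r1 * r0, swap the inputs so no MOV is needed. */
    if (dst == src2) {
        int temp = src1;
        src1 = src2;
        src2 = temp;
    }

    if (dst != src1) {
        /* The register allocator was not kind: copy the first input over. */
        snprintf(out, n, "mov  r%d, r%d\nmpy  r%d, r%d\n",
                 src1, dst, src2, dst);
        return 2;
    }

    snprintf(out, n, "mpy  r%d, r%d\n", src2, dst);
    return 1;
}
```

For example, `emit_mul(buf, sizeof buf, 0, 1, 0)` models r0 = r1 * r0 and yields only `mpy  r1, r0`, while a destination of r2 with inputs r3 and r4 yields a `mov` followed by the `mpy`.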

Moving on to questions...

Quote

* What registers are used by the compiler generated program? Are all 16 registers used or are some of them "free" for own use?

GCC will attempt to make maximum use of all 16 registers, so there's no guarantee that there are any lying around unused.

If you have an assembly routine you would like to interface with C code, the information needed for that should be shown in earlier posts. If you would like more detailed information (calling convention, register usage and allocation order, etc) I'd be happy to let you know. I have a document I've been neglecting which should include all this stuff.

Quote

* What (scratchpad) memory is used by the compiler generated program?

GCC (or any compiler) is basically an engine to convert source code into assembly. That means you have complete control over what memory is used, and for what purpose. So unlike Basic, Java or Forth (I presume), there is no other code working behind the scenes you need to be aware of. All of the machine's resources are available to you, and any other limitations are of your own making.

In the example code I've posted earlier, the registers are located at >8300 by the crt0 code. Except for what's used by the registers, all of scratchpad memory is available. If you wanted to, you could put the registers elsewhere in scratchpad or 8-bit memory with no impact on the C code.
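Since the TMS9900 keeps its sixteen registers in RAM at the workspace pointer, the scratchpad left over can be worked out directly. The sketch below assumes the usual TI-99/4A scratchpad of 256 bytes at >8300 through >83FF and a workspace at >8300; the constant and function names are made up for illustration.

```c
#include <assert.h>

/* Assumed layout: 256 bytes of scratchpad RAM at >8300, workspace at >8300. */
#define SCRATCHPAD_BASE 0x8300u
#define SCRATCHPAD_SIZE 0x0100u
#define WORKSPACE_BASE  0x8300u
#define WORKSPACE_SIZE  (16u * 2u)   /* 16 registers, one 16-bit word each */

/* Memory address of workspace register Rn (hypothetical helper). */
static unsigned reg_address(unsigned n)
{
    assert(n < 16u);
    return WORKSPACE_BASE + 2u * n;
}

/* First scratchpad address not occupied by the workspace. */
static unsigned free_scratchpad_start(void)
{
    return WORKSPACE_BASE + WORKSPACE_SIZE;
}

/* Number of scratchpad bytes left after the workspace. */
static unsigned free_scratchpad_bytes(void)
{
    return SCRATCHPAD_BASE + SCRATCHPAD_SIZE - free_scratchpad_start();
}
```

Under these assumptions R11 sits at >8316, and the 224 bytes from >8320 to >83FF remain free for data or small routines.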

You can store data anywhere in the system you like (like Lucien did in his bricks code). Using the linker, you can build your code to run from anywhere in memory. You could even put (small bits of) code into scratchpad and run from there for that extra performance boost (as I believe was done in Parsec).

Remember, C was originally designed to write device drivers and operating systems, so the sky's the limit here.

#78 retroclouds OFFLINE  

retroclouds

    Stargunner

  • 1,095 posts
  • Location:Germany

Posted Wed Jul 27, 2011 6:42 AM

Thanks for the update. :)

Good to know that there is no memory used, except for the registers.
Reason I'm asking is because I'm toying with the idea of interfacing my spectra2 assembly library with C code.

No rush though, first need to get the next release out and wanna get myself confident with C :)

#79 insomnia OFFLINE  

insomnia

    Space Invader

  • 36 posts
  • Location:Pittsburgh, PA

Posted Mon Aug 8, 2011 12:51 AM

OK, it's patch time again.

This patch includes the fixes for all bugs mentioned here since the last patch release in addition to a few I found on my own. The same patch and build directions used before are used for this one too.

Here are the changes in this release:

Fixed a bug with byte initializers, it was handling negative values wrongly
Fixed multiply bug, it was using the wrong registers
Changed frame pointer from R8 to R9. Frame was being lost
Byte reads from memory were assumed to be copied into the register's LSB.
Fixed a problem with AND improperly modifying input values.
Fixed a bug where R11 was not saved if used as a data register.
Modified output to use hex values for all constants and addresses

I've also packaged up an ELF to EA5 converter and an example program made to run as an EA5 image. The program does the same useless flashing text thing that the cart example did. This was done to make the differences easier to spot. The changes made to the EA5 crt0 are a bit safer than the one used in the cart version (this one better handles zero size sections). I'll probably release a new version of the cart tool and example sometime soon which incorporates these changes.

The next thing on my list is to update all the documentation. Everything I've posted so far is still valid, but there are probably holes where some subjects need more description.

I also need to put together a library for the missing 32-bit functions (multiply, divide, modulus, shift). These functions are already written and tested for the most part, so releasing them should be quick and easy.

Finally, I need to make my V9T9 disk management tool ready for public consumption. It currently works, and the disk images it creates were used to test the EA5 converter, but it's super hacky at the moment. Once I spruce it up a bit and turn it into a useful tool, I can send it out the door.

As always, the gory details are on my blog for those who are interested.


#80 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Mon Aug 8, 2011 2:18 PM

insomnia, on Mon Aug 8, 2011 12:51 AM, said:

I've also packaged up an ELF to EA5 converter

Great, thanks! :thumbsup:

#81 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Sat Aug 20, 2011 5:43 AM

I found 2 new bugs. :ponder:

void vdp_copy_from_sys(int index, char* src, int size) {
	char* end = src + size;
	VDP_ADDRESS_REG = index;
	VDP_ADDRESS_REG = (char)(index >> 8) | VDP_WRITE_FLAG;
	while(src < end) VDP_WRITE_DATA_REG = *src++;
}

void main() {
	int i=0;
	char c;
	FAC=0x900;
	gpl_link(0x18);
	FAC=0xB00;
	gpl_link(0x4A);
	while(1) {
		interrupts();
		if(key_scan(&c)) {
			c=random_byte(26)+65;
			vdp_copy_from_sys(i,&c,1);
			i++; if(i>=32*24)i=0;
			gpl_link(0x34);
		}
	}
}

Code generated for the call of "vdp_copy_from_sys" ("i" is in R9):

	movb r9, @>8C02   INCORRECT
	mov  r9, r2
	ori r2, >4000
	movb r2, @>8C02
	movb r1, @>8C00
	inc  r9
	ci   r9, >2FF

If I comment out the line "c=random_byte(26)+65;" ("i" is in R2):

	mov  r2, r1
	swpb r1           CORRECT
	movb r1, @>8C02
	mov  r2, r1
	ori r1, >4000
	movb r1, @>8C02
	movb r3, @>8C00
	inc  r2
	ci   r2, >2FF
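
The difference between the two listings comes down to which byte MOVB takes from a register: it transfers the most significant byte, so the low byte of "i" only reaches the VDP if the word is byte-swapped first. Here is a rough host-side C model of that; the helper names are invented for this illustration and the code is not part of the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* MOVB with a register source transfers the register's high byte. */
static uint8_t movb_source_byte(uint16_t reg)
{
    return (uint8_t)(reg >> 8);
}

/* SWPB exchanges the two bytes of a word. */
static uint16_t swpb(uint16_t reg)
{
    return (uint16_t)((reg << 8) | (reg >> 8));
}

/* First address byte sent by the incorrect listing: movb r9, @>8C02 */
static uint8_t first_byte_incorrect(uint16_t index)
{
    return movb_source_byte(index);
}

/* First address byte sent by the correct listing: swpb, then movb. */
static uint8_t first_byte_correct(uint16_t index)
{
    uint8_t sent = movb_source_byte(swpb(index));
    assert(sent == (uint8_t)(index & 0xFF));   /* always the low byte */
    return sent;
}
```

For i = >0123 the incorrect sequence sends >01 as the first address byte, whereas the C source asks for the low byte, >23.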

And the other bug, with the assembler.

gplws	equ	0x83E0

	def	kscan
kscan	lwpi	gplws
	bl	@>E
	lwpi	>8300
	b	*r11

If I put "gplws equ 0x83E0" after the "kscan" routine, there is no error message, but "kscan" does not work anymore.

Here is the complete source: UTILS.zip (attached)

#82 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Sun Aug 21, 2011 11:23 AM

I found another bug with the assembler. I think this one should be easier to correct. :) The STST instruction can't be assembled. The error message is "Error: missing comma separator".

I could not find the string "STST" or "LWPI" in the "complete_files" folder. Where are the assembler instructions in the source?

#83 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Mon Aug 22, 2011 9:37 AM

lucien2, on Sun Aug 21, 2011 11:23 AM, said:

I could not find the string "STST" or "LWPI" in the "complete_files" folder. Where are the assembler instructions in the source?

Ok, I searched in the GCC source, but it's in the BINUTILS source.

I see that in the "parse_table level_4" structure there is "{parse_type_8a, "stst"}".
So, here is "parse_type_8a":

static void parse_type_8a(struct buffer *buffer, disassemble_info *info, char *text)
{
  //                  |15|14|13|12|11|10| 9| 8| 7| 6| 5| 4| 3| 2| 1| 0|
  // register, count  |Opcode                          | 0|Register   |

  info->fprintf_func (info->stream, "%s r%d", 
                      text, buffer->opcode & 0xF);
}

It seems correct, why is it waiting for a comma?

#84 insomnia OFFLINE  

insomnia

    Space Invader

  • 36 posts
  • Location:Pittsburgh, PA

Posted Tue Aug 23, 2011 1:21 AM

Actually, the code that you found was in the disassembler, which isn't much help. What you want is line 722 of binutils-2.19.1/gas/config/tc-tms9900.c

Change that line to
  { "stst", 0x02C0, {ARG_REGISTER,  ARG_NONE}},
and you are back in business.

The error you were seeing was due to the fact that this instruction was falsely insisting upon a second argument for STST. Something like this would have made it happy:
stst r0, >0000

It also seems like there is a problem with the SBO, SBZ and TB instructions. During the assembly process, the bit offsets are being reduced by half. They are currently using constants in the same way as JMP, which is wrong. I need to add a new constant type for these CRU instructions for correct operation. At least LDCR and STCR look right. I haven't had a chance to look at your other issues yet, but I should have some answers for you tomorrow.
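
To illustrate the halving: a jump stores its displacement in words, so the assembler divides the byte distance by two, while the CRU single-bit instructions store the bit displacement unchanged. The following is a simplified model rather than the actual gas code; the opcode bases (>1000 for JMP, >1D00 for SBO) are taken to be those of the TMS9900 instruction set, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* JMP: 8-bit signed displacement in words, relative to the next instruction. */
static uint16_t encode_jmp(uint16_t pc, uint16_t target)
{
    int byte_distance = (int)target - ((int)pc + 2);
    assert(byte_distance % 2 == 0);        /* instructions are word aligned */
    return (uint16_t)(0x1000 | ((byte_distance / 2) & 0xFF));
}

/* SBO: 8-bit signed CRU bit displacement, stored exactly as written. */
static uint16_t encode_sbo(int bit)
{
    assert(bit >= -128 && bit <= 127);
    return (uint16_t)(0x1D00 | (bit & 0xFF));
}

/* Model of the reported bug: the displacement is halved as if it were a jump. */
static uint16_t encode_sbo_halved(int bit)
{
    return (uint16_t)(0x1D00 | ((bit / 2) & 0xFF));
}
```

In this model "sbo 4" should assemble to >1D04, but treating the constant like a jump distance gives >1D02, which addresses the wrong CRU bit.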

#85 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Tue Aug 23, 2011 4:55 AM

insomnia, on Tue Aug 23, 2011 1:21 AM, said:

Actually, the code that you found was in the disassembler, which isn't much help. What you want is line 722 of binutils-2.19.1/gas/config/tc-tms9900.c

Thanks! I only checked the first file where it found "stst". :D

Quote

I haven't had a chance to look at your other issues yet, but I should have some answers for you tomorrow.

There's no hurry, I'm not blocked with these ones.

#86 insomnia OFFLINE  

insomnia

    Space Invader

  • 36 posts
  • Location:Pittsburgh, PA

Posted Thu Mar 15, 2012 12:17 AM

Well, after a really long time without any signs of life from this project, it's patch time.

A big "thank you" goes out to Lucien. A lot of the updates here are a direct result of the effort he put into making Rush Hour. He did a great job wading through all the brokenness to make a functional game. Now it's time for everyone to benefit from that work.

New Binutils fixes in this release:

STST was incorrectly looking for two arguments
SBO, SBZ and TB incorrectly using constants
EQU'ed symbols sometimes replaced using wrong endianness

GCC fixes:

Fixed several word-to-byte conversion errors
Fixed "unrecognizable instruction" for zero comparison operations
Made optimizations for most comparison operations
Improved correctness of condition flag handling
Switch statements now work properly
Fixed division and modulus, operands were used in wrong order
Fixed subtract, operands were occasionally used in wrong order
Fixed stack frame corruption when local variables are in use
Added optimizations for forms like (int Y)=((int)(char X))<<N

The patch and build procedures are the same as always. Development notes are on my blog for those who are interested.

Things are shaping up pretty well so far. (Yes it is taking forever, sorry about that.) I don't see any obvious holes to fill, or optimizations yet to do. At this point, I just need to exercise the compiler with larger programs and increase test coverage. If anyone finds a problem, or sees an area where improvements can be made, please let me know.

I'm continuing to work on related projects (disk management tool, libc library, documentation). There's still lots to do, so these updates will keep coming.


#87 sometimes99er OFFLINE  

sometimes99er

    Stargunner

  • 1,918 posts
  • Location:Denmark

Posted Thu Mar 15, 2012 1:35 AM

Great work.

insomnia said:

(int y) = ((int)(charx))<<8
sra r2, 8     * 12+2*8+4=32
sla r2, >8    * 12+2*8+4=32
total: 64 clocks, 4 bytes

Sorry for some maybe stupid questions. And sorry for not posting on your blog.

So "charx" must be something between >00 and >FF. Right ?

Looking at the first instruction, being SRA (Shift Right Arithmetic), I assume "charx" is represented in memory as a word (looking at the code it's in R2 at that stage) like something between >00xx and >FFxx ?

SRA fills vacated bit positions with original MSB (Most Significant Bit).

So "SRA R2,8" would turn >00xx into >0000, and >FFxx would be >FFFF ?

But wouldn't that make (int)(charx) wrong ? - Shouldn't >FF00 be turned into >00FF ?

#88 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Thu Mar 15, 2012 5:16 AM

insomnia, on Thu Mar 15, 2012 12:17 AM, said:

Well, after a really long time without any signs of life from this project, it's patch time.

Wonderful! :thumbsup:

Quote

At this point, I just need to exercise the compiler with larger programs and increase test coverage.

Count on me for that. :)

#89 insomnia OFFLINE  

insomnia

    Space Invader

  • 36 posts
  • Location:Pittsburgh, PA

Posted Thu Mar 15, 2012 8:07 AM

sometimes99er, on Thu Mar 15, 2012 1:35 AM, said:

So "charx" must be something between >00 and >FF. Right ?

Looking at the first instruction, being SRA (Shift Right Arithmetic), I assume "charx" is represented in memory as a word (looking at the code it's in R2 at that stage) like something between >00xx and >FFxx ?

SRA fills vacated bit positions with original MSB (Most Significant Bit).

So "SRA R2,8" would turn >00xx into >0000, and >FFxx would be >FFFF ?

But wouldn't that make (int)(charx) wrong ? - Shouldn't >FF00 be turned into >00FF ?

The example was written assuming the values are currently in registers, and doesn't use correct C syntax. The idea was to use shorthand and C-like pseudocode to get the idea across quickly. A real-life example would look something like this:

void do_something()
{
  char x;
  int y;
  ...
  y=((int)x)<<4;
  ...
}

You're right, X must be a value between >00 and >FF, and if the X value is in memory, it need not occupy a full word. Once copied into a register (using MOVB or something) the value will be stored in the high byte.

What you wrote would be true for unsigned values, but not for signed ones.

>FFxx in a register can be interpreted as either (char)(-1) or (unsigned char)(255)
>FFFF is (int)(-1)
>00FF would be (int)(255)

There are optimizations for both of these, but I only used an example for signed values since the timings are the same and only differ by the SRA or SRL instruction.

The initial implementation would produce this code:
* Assume x has a value of -4 (>FC), and is stored in r2 as >FCxx
* y = (-4)<<4 = -4 * 16 = -64 = >FFC0
sra r2, 8   * Convert to signed integer (r2=FFFC)
sla r2, 4   * Left shift converted value (r2=FFC0)

The optimization emits this code:
* Assume x has a value of -4 (>FC), and is stored in r2 as >FCxx
* y = (-4)<<4 = -4 * 16 = -64 = >FFC0
sra r2, 4   * Shift into final position (r2=FFCx)
andi r2, >FFF0	* Mask unknown bits (r2=FFC0)

And finally, for unsigned values:
* Assume x has a value of 252 (>FC), and is stored in r2 as >FCxx
* y = 252<<4 = 252 * 16 = 4032 = >0FC0
srl r2, 4	* Shift into final position (r2=0FCx)
andi r2, >FFF0	* Mask unknown bits (r2=0FC0)

Since fewer bit shifts are required, the optimized code runs faster (I figure about 33% faster on average), but uses one additional code word.
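
The three sequences above can be checked with a small host-side C model of the shift instructions. This is only a sketch for verifying the arithmetic: `sra`, `srl` and `sla` here are hand-written helpers operating on a 16-bit value, and the shift count N is assumed to lie between 1 and 7.

```c
#include <assert.h>
#include <stdint.h>

/* Shift right arithmetic: vacated bits are filled with the original MSB. */
static uint16_t sra(uint16_t v, unsigned n)
{
    assert(n >= 1 && n <= 15);
    uint16_t r = (uint16_t)(v >> n);
    if (v & 0x8000u)
        r |= (uint16_t)(0xFFFFu << (16 - n));
    return r;
}

/* Shift right logical: vacated bits are filled with zeros. */
static uint16_t srl(uint16_t v, unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint16_t)(v >> n);
}

/* Shift left arithmetic. */
static uint16_t sla(uint16_t v, unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint16_t)((uint32_t)v << n);
}

/* Initial implementation: sign-extend the high byte, then shift left. */
static uint16_t signed_shift_initial(uint16_t r2, unsigned n)
{
    return sla(sra(r2, 8), n);
}

/* Optimized: one shift into final position, then mask the unknown bits. */
static uint16_t signed_shift_optimized(uint16_t r2, unsigned n)
{
    assert(n >= 1 && n <= 7);
    return (uint16_t)(sra(r2, 8 - n) & (0xFFFFu << n));
}

/* Optimized form for unsigned char values: SRL instead of SRA. */
static uint16_t unsigned_shift_optimized(uint16_t r2, unsigned n)
{
    assert(n >= 1 && n <= 7);
    return (uint16_t)(srl(r2, 8 - n) & (0xFFFFu << n));
}
```

With r2 = >FC37 (x = >FC plus arbitrary low bits) and N = 4, both signed variants give >FFC0 and the unsigned variant gives >0FC0, matching the comments in the listings.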

#90 sometimes99er OFFLINE  

sometimes99er

    Stargunner

  • 1,918 posts
  • Location:Denmark

Posted Thu Mar 15, 2012 9:56 AM

Very clear explanation. Thanks.

#91 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Sun Mar 18, 2012 11:48 AM

I think a "small" tutorial to install GCC on Windows is missing. Here is one:

Spoiler


If you don't want to do all these steps, I prepared a compiled version (26MB) https://drivehq.com/...I99-GCC-1.5.zip

Here is how you install it:

1. Install Cygwin (22MB)
http://cygwin.com/setup.exe
Choose the second mirror in the list

Choose the following options (bin checkbox):
Category "Devel", "make"
Category "Libs", "libmpfr-devel"

2. Run Cygwin to create your home directory

3. Extract "TI99-GCC-1.5.zip" to your home directory "cygwin/home/yourname"

4. Copy "/cygwin/home/yourname/binutils/bin/tms9900-as.exe" to "cygwin/bin/as.exe"


To try it, change the paths in "~/RUSH_HOUR/Makefile" from "/home/-" to "/home/yourname" and type this at the Cygwin console:
$ cd ~/RUSH_HOUR
$ make

It creates "rush_hour.ea5.bin" and "rush_hour.c.bin" in the "cygwin/home/yourname/RUSH_HOUR" directory.
The .c.bin file is a cartridge binary ready to use. The .ea5.bin must be converted to the TIFILE format with Ti99Dir.

To check the generated assembly code, type this:
$ ~/gcc/libexec/gcc/tms9900/4.4.0/cc1 -O2 main.c

Edited by lucien2, Mon Mar 19, 2012 12:57 AM.


#92 lucien2 OFFLINE  

lucien2

    Chopper Commander

  • 160 posts
  • Location:Switzerland

Posted Mon Apr 16, 2012 10:12 PM

How do you put a single quote in a string constant?

With the TI assembler, you have to double it. GCC doubles it for the GNU assembler, but the assembler gives this error:
Junk at end of line, first unrecognized character is `''



