diff --git a/CNAME b/CNAME new file mode 100644 index 0000000..ac9b5f0 --- /dev/null +++ b/CNAME @@ -0,0 +1 @@ +psx-spx.consoledev.net \ No newline at end of file diff --git a/docs/cdromdrive.md b/docs/cdromdrive.md index a51833c..bdf409a 100644 --- a/docs/cdromdrive.md +++ b/docs/cdromdrive.md @@ -1820,6 +1820,28 @@ before the cdrom controller receives and skips further sectors). Otherwise sectors would be lost without notice (there appear to be absolutely no overrun status flags, nor overrun error interrupts).
+#### Update:
+This is confirmed, as found in the SCEA_BBS.pdf bulletin board archive:
+```
+12/29/95 10:24 AM
+Re(4): CD buffer
+Thomas Boyd
+CD
+Dan Burnash
+OK. This is the story of the CD ROM subsystem sector buffer:
+The CD-ROM subsystem sector buffer is currently 32K. It is located in the CD-ROM subsystem.
+It uses a sort-of tripple buffering system to read sectors in and make one (and ONLY one) sector
+available to the user.
+Common questions that spring to mind and their answers:
+Q: 32K - (2352 bytes/sector)*(3 buffered sectors) = lots of leftover RAM! Can I use it? A: No. It is
+not accessible by anything but the CD-ROM subsystem.
+Q: How dissappointing. As consolation, can I be told what the extra memory is used for? A: The
+memory was going to be used for sound mapping, but (1) the system would be too slow, and
+(2) sound mapping is already done by the SPU. The current implementation of this memory is ...
+nothing. It is vestigal and will be cut out in future manufacturing cost reduction designs.
+Tom
+```
+
#### Sector Buffer Test Cases
```
Setloc(0:2:0)+Read
diff --git a/docs/graphicsprocessingunitgpu.md b/docs/graphicsprocessingunitgpu.md
index 37ccabf..9799c77 100644
--- a/docs/graphicsprocessingunitgpu.md
+++ b/docs/graphicsprocessingunitgpu.md
@@ -445,6 +445,7 @@ write-protected, and cannot be overwritten by (new) rendering commands.
The mask setting affects all rendering commands, as well as CPU-to-VRAM and VRAM-to-VRAM transfer commands (where it acts on the separate halfwords, ie. as for 15bit textures). However, Mask does NOT affect the Fill-VRAM command.
+This setting is used in games such as Metal Gear Solid and Silent Hill. #### Note GP0(E3h..E5h) do not take up space in the FIFO, so they are probably executed @@ -502,6 +503,8 @@ Fill does NOT occur when Xsiz=0 or Ysiz=0 (unlike as for Copy commands). Xsiz=400h works only indirectly: Param=400h is handled as Xsiz=0, however, Param=3F1h..3FFh is rounded-up and handled as Xsiz=400h.
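The size handling for the Fill command can be sketched in a few lines. This is a minimal illustration, not code from the documentation: the function name `mask_fill_size` is hypothetical, and the formulas are an assumption based on the Fill masking rules (width rounded up to 16-pixel steps after a 10-bit mask, height limited to 9 bits).

```python
def mask_fill_size(xsiz: int, ysiz: int) -> tuple[int, int]:
    """Return the effective (width, height) of a Fill-VRAM command.

    Assumed rules: the width keeps 10 bits and is then rounded up to a
    multiple of 16; the height simply keeps its low 9 bits.
    """
    width = ((xsiz & 0x3FF) + 0x0F) & ~0x0F
    height = ysiz & 0x1FF
    return width, height


# Param=400h is masked to 0 before rounding, so nothing is filled.
print(mask_fill_size(0x400, 16))
# Param=3F1h..3FFh rounds up to the full 400h width.
print(mask_fill_size(0x3F1, 16))
# A height of 512 rows is masked to 0; 511 is the largest usable value.
print(mask_fill_size(64, 512), mask_fill_size(64, 511))
```

Under these assumptions, an emulator or a game that needs to clear all 512 rows would have to issue two Fill commands.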
+Note that because of the height (Ysiz) masking, a maximum of 511 rows can be filled in a single command. Calling a fill with a full VRAM height of 512 rows will be ineffective as the height will be masked to 0. + #### Masking for COPY Commands parameters ``` Xpos=(Xpos AND 3FFh) ;range 0..3FFh @@ -594,7 +597,7 @@ command.
``` 0-23 Not used (zero) ``` -Resets the command buffer.
+Resets the command buffer and CLUT cache.
#### GP1(02h) - Acknowledge GPU Interrupt (IRQ1) ``` @@ -644,7 +647,7 @@ size=(X2-X1/cycles\_per\_pix), (Y2-Y1).
#### GP1(06h) - Horizontal Display range (on Screen) ``` - 0-11 X1 (260h+0) ;12bit ;\counted in 53.222400MHz units, + 0-11 X1 (260h+0) ;12bit ;\counted in video clock units, 12-23 X2 (260h+320*8) ;12bit ;/relative to HSYNC ``` Specifies the horizontal range within which the display area is displayed. For @@ -660,11 +663,12 @@ due to programming bugs). Pandemonium 2 is using a bigger "overscan" width The 260h value is the first visible pixel on normal TV Sets, this value is used by MOST NTSC games, and SOME PAL games (see below notes on Mis-Centered PAL games).
+The video clock unit depends on the console region, regardless of the NTSC/PAL video mode set by GP1(08h).3; see the section on [nominal video clocks](#nominal-video-clock) for values.
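As a rough illustration of the GP1(06h) layout, the sketch below packs X1 and X2 into a command word and derives the displayed width. The helper names are hypothetical, and the divider of 8 video clocks per pixel is taken from the `320*8` term in the register description (it applies to the 320-pixel mode only).

```python
def gp1_horizontal_range(x1: int, x2: int) -> int:
    """Pack X1 (bits 0-11) and X2 (bits 12-23) into a GP1(06h) word."""
    return (0x06 << 24) | ((x2 & 0xFFF) << 12) | (x1 & 0xFFF)


def displayed_width(x1: int, x2: int, cycles_per_pixel: int) -> int:
    """Width in pixels, with X1/X2 counted in video clock units."""
    return (x2 - x1) // cycles_per_pixel


X1 = 0x260             # first visible pixel on normal TV sets
X2 = 0x260 + 320 * 8   # 320 pixels at 8 video clocks per pixel

command = gp1_horizontal_range(X1, X2)
print(hex(command), displayed_width(X1, X2, 8))
```

Because X1 and X2 are counted in video clocks rather than pixels, the same register values cover a different number of pixels in each horizontal resolution.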
#### GP1(07h) - Vertical Display range (on Screen) ``` - 0-9 Y1 (NTSC=88h-(224/2), (PAL=A3h-(264/2)) ;\scanline numbers on screen, - 10-19 Y2 (NTSC=88h+(224/2), (PAL=A3h+(264/2)) ;/relative to VSYNC + 0-9 Y1 (NTSC=88h-(240/2), (PAL=A3h-(288/2)) ;\scanline numbers on screen, + 10-19 Y2 (NTSC=88h+(240/2), (PAL=A3h+(288/2)) ;/relative to VSYNC 20-23 Not used (zero) ``` Specifies the vertical range within which the display area is displayed. The @@ -674,9 +678,9 @@ to generate vblank interrupts (IRQ0).
The 88h/A3h values are the middle-scanlines on normal TV Sets, these values are used by MOST NTSC games, and SOME PAL games (see below notes on Mis-Centered PAL games).
-The 224/264 values are for fullscreen pictures. Many NTSC games display 240 -lines (overscan with hidden lines). Many PAL games display only 256 lines -(underscan with black borders).
+The 240/288 values are for fullscreen pictures. Many NTSC games display 240 +lines, but on most analog television sets, only 224 lines are visible (8 lines of overscan on top and 8 lines of overscan on bottom). Many PAL games display only 256 lines (underscan with black borders).
+Some games such as Chrono Cross will occasionally adjust these values to create a screen shake effect, so proper emulation of this command is necessary for those particular cases.
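The Y1/Y2 values above can be computed from a centre scanline and a picture height. The following is a small sketch under stated assumptions: `gp1_vertical_range` and its `shake` parameter are hypothetical names, and the shake offset only illustrates how a game might move the picture by rewriting this register.

```python
def gp1_vertical_range(center: int, height: int, shake: int = 0) -> int:
    """Pack Y1 (bits 0-9) and Y2 (bits 10-19) into a GP1(07h) word.

    `shake` is a hypothetical per-frame offset in scanlines.
    """
    y1 = center - height // 2 + shake
    y2 = center + height // 2 + shake
    return (0x07 << 24) | ((y2 & 0x3FF) << 10) | (y1 & 0x3FF)


ntsc = gp1_vertical_range(0x88, 240)            # Y1=10h, Y2=100h
pal = gp1_vertical_range(0xA3, 288)             # Y1=13h, Y2=133h
shaken = gp1_vertical_range(0x88, 240, shake=2)  # picture moved down 2 lines
print(hex(ntsc), hex(pal), hex(shaken))
```

An emulator that ignores writes to this register after boot would render such a shake effect as a static picture.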
#### GP1(08h) - Display mode ``` @@ -1150,8 +1154,7 @@ from 16 to 31.
## GPU Texture Caching The GPU has 2 Kbyte Texture Cache
-The Texture Cache is (maybe) also used for CLUT data - or is there a separate -CLUT Cache - or is the CLUT uncached - but that'd be trash?
+There is also a CLUT cache that is preserved between GPU drawing commands. The CLUT cache is invalidated when different CLUT index values are used or when GP0(01h) is issued. It is unknown if the CLUT cache overlaps or is shared with the Texture Cache. If polygons with texture are displayed, the GPU needs to read these from the frame buffer. This slows down the drawing process, and as a result the number @@ -1201,6 +1204,47 @@ but also on entry 9 of block 2, these cannot be in the cache at once.
## GPU Timings
+#### Nominal Video Clock
+
+```
+  NTSC video clock = 53.693175 MHz
+  PAL video clock  = 53.203425 MHz
+```
+A console always uses the video clock for its own region, regardless of whether the GPU is configured for NTSC or PAL output, because an NTSC console lacks a PAL reference clock and vice versa. Unless a console is modified with an additional oscillator for the other region, it may experience drift over time when playing content from a different video region. See vertical refresh rates below.
+
+#### Vertical Video Timings
+```
+  263 scanlines per field for NTSC non-interlaced
+  262.5 scanlines per field for NTSC interlaced
+
+  314 scanlines per field for PAL non-interlaced
+  312.5 scanlines per field for PAL interlaced
+```
+Horizontal blanking and vertical blanking signals occur on the video output side as expected for NTSC/PAL signals. These are not necessarily the same as the timer/interrupt HBLANK and VBLANK.
+
+#### Vertical Refresh Rates
+```
+NTSC mode on NTSC video clock
+  Interlaced:     59.940 Hz
+  Non-interlaced: 59.826 Hz
+
+PAL mode on PAL video clock
+  Interlaced:     50.000 Hz
+  Non-interlaced: 49.761 Hz
+
+NTSC mode on PAL video clock
+  Interlaced:     59.393 Hz
+  Non-interlaced: 59.280 Hz
+
+PAL mode on NTSC video clock
+  Interlaced:     50.460 Hz
+  Non-interlaced: 50.219 Hz
+```
+For emulation purposes, it's recommended to use an NTSC video clock when running NTSC content (or in NTSC mode) and a PAL clock when running PAL content (or in PAL mode).
+
+TODO: Derivations for vertical refresh rates; horizontal timing notes
+
+**Nocash's original GPU Timings notes:**
#### Video Clock
The PSone/PAL video clock is the cpu clock multiplied by 11/7.
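The figures in the Vertical Refresh Rates table added in this section can be reproduced as video clock / (clocks per scanline x scanlines per field). The sketch below is not the pending derivation from the TODO. It assumes 3412.5 video clocks per scanline in NTSC mode and 3405 in PAL mode, values that are not stated in this section and were chosen only because they match the table to within 0.001 Hz. All names are hypothetical.

```python
NTSC_CLOCK_HZ = 53_693_175
PAL_CLOCK_HZ = 53_203_425

# ASSUMPTION: video clocks per scanline for each output mode.
CLOCKS_PER_LINE = {"NTSC": 3412.5, "PAL": 3405.0}

# Scanlines per field, keyed by (mode, interlaced), from Vertical Video Timings.
LINES_PER_FIELD = {
    ("NTSC", True): 262.5,
    ("NTSC", False): 263.0,
    ("PAL", True): 312.5,
    ("PAL", False): 314.0,
}


def refresh_rate_hz(clock_hz: int, mode: str, interlaced: bool) -> float:
    """Field rate for a given oscillator and GPU output mode."""
    clocks_per_field = CLOCKS_PER_LINE[mode] * LINES_PER_FIELD[(mode, interlaced)]
    return clock_hz / clocks_per_field


for clock_name, clock_hz in (("NTSC", NTSC_CLOCK_HZ), ("PAL", PAL_CLOCK_HZ)):
    for mode in ("NTSC", "PAL"):
        for interlaced in (True, False):
            rate = refresh_rate_hz(clock_hz, mode, interlaced)
            kind = "interlaced" if interlaced else "non-interlaced"
            print(f"{mode} mode on {clock_name} clock, {kind}: {rate:.3f} Hz")
```

Keeping the oscillator and the output mode as separate inputs makes the mixed-region cases (for example PAL mode on an NTSC console) fall out of the same formula.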
``` diff --git a/docs/interrupts.md b/docs/interrupts.md index 65f43af..1a5dbdb 100644 --- a/docs/interrupts.md +++ b/docs/interrupts.md @@ -15,7 +15,7 @@ Mask: Read/Write I\_MASK (0=Disabled, 1=Enabled)
7 IRQ7 Controller and Memory Card - Byte Received Interrupt 8 IRQ8 SIO 9 IRQ9 SPU - 10 IRQ10 Controller - Lightpen Interrupt (reportedly also PIO...?) + 10 IRQ10 Controller - Lightpen Interrupt. Also shared by PIO and DTL cards. 11-15 Not used (always zero) 16-31 Garbage ``` diff --git a/docs/soundprocessingunitspu.md b/docs/soundprocessingunitspu.md index c3550af..137f924 100644 --- a/docs/soundprocessingunitspu.md +++ b/docs/soundprocessingunitspu.md @@ -12,7 +12,7 @@ [SPU Reverb Formula](soundprocessingunitspu.md#spu-reverb-formula)
[SPU Reverb Examples](soundprocessingunitspu.md#spu-reverb-examples)
[SPU Unknown Registers](soundprocessingunitspu.md#spu-unknown-registers)
- +[SPU Internal State Machine from SPU RAM Timing](soundprocessingunitspu.md#spu-internal-state-machine-from-spu-ram-timing)
## SPU Overview @@ -702,9 +702,14 @@ reverb write(s) are triggering interrupts.
Data Transfers (usually via DMA4) to/from SPU-RAM do also trap SPU interrupts.
#### Note -IRQ Address is used by Metal Gear Solid, Legend of Mana, Tokimeki Memorial 2, -Crash Team Racing, The Misadventures of Tron Bonne, and (somewhat?) by Need For -Speed 3.
+The IRQ Address is used in the following games (not exhaustive):
+
+- Metal Gear Solid: dialogue and Konami intro.
+- Legend of Mana.
+- Hercules: the memory card loading screen's lip sync.
+- Tokimeki Memorial 2.
+- Crash Team Racing: lip sync, requires capture buffers.
+- The Misadventures of Tron Bonne: dialogue.
+- Need For Speed 3: (somewhat?).
@@ -984,5 +989,93 @@ seem to modify the registers at any time during sound output, nor reverb calculations, nor activated external audio input... the registers seem to be just some kind of general-purpose RAM.
+## SPU Internal State Machine from SPU RAM Timing
+### Introduction
+
+The 33.8688 MHz clock of the PSX is a well-chosen value.
+It is exactly 768 x 44.1 kHz, so for each audio sample at CD quality there are 768 system clock cycles.
+The state machine therefore has to repeat its complete cycle every 768 system clock cycles.
+
+The full job to do within those 768 cycles:
+
+- Process 24 channels.
+- Compute the reverb and write it back.
+- Write back voice 1 / 3 and audio CD L/R.
+- Transfer SPU RAM data from/to the CPU bus if requested.
+
+### First look at the data from the logic analyzer
+
+By looking at the signals of the SPU RAM chip, it is possible to figure out what it is reading and writing.
+
+- Each read or write to SPU RAM takes 8 clock cycles. (Not checked in detail, but this probably leaves room for refresh and everything else.)
+- Each channel uses 24 cycles (3 operations of 8 cycles):
+    - Two reads for the current ADPCM block: one of the header of the currently played ADPCM block, one of the current 16-bit word of the ADPCM data.
+    - An unrelated read (see later).
+- 8 cycles for each of these operations: WRITE CD Left, WRITE CD Right, Voice 1 WRITE, Voice 3 WRITE.
+- Reverb operations: 14 memory operations of 8 cycles.
+
+### Sequence of work
+
+From the captured data, it is possible to figure out what the operations are and in what order they are done.
+But it is not possible to figure out which operation is the FIRST one in the loop.
+So we arbitrarily decide to start the loop at 'Voice 1' (voices being numbered from 1 to 24).
+
+- Voice 1
+- Write CD Left
+- Write CD Right
+- Write Voice 1
+- Write Voice 3
+- Reverb
+- Voice 2
+- Voice 3
+- Voice 4
+- ...
+- ...
+- Voice 23
+- Voice 24
+
+As written earlier, each voice is 3x RAM access (one unrelated), the reverb is 14x RAM access, and the four writes are 4x RAM access.
+### What we can guess from this information
+- If the system wants the reverb to be done at the end, and the writes to stay in sync with Voice 1 and 3, then the loop most likely starts at Voice 2.
+- The ADPCM decoder has to keep internal state about the samples. As the algorithm depends on the previous values inside a block, it can't directly access a given sample in the block.
+- This also explains why the reverb runs at 22.05 kHz: there is not enough bandwidth to do everything within 768 cycles if it ran at 44.1 kHz.
+- Even when voices are not active, they always read something. One can guess that the sample is simply ignored at some point in the data path (volume set to zero internally, or a mux not selecting the value). Interestingly, if garbage is introduced into those reads, it may be possible to find out how it is cancelled (by suddenly enabling the channel and reading the samples out of channel 1 or 3) -> the DSP keeps a history of samples for Gaussian interpolation.
+
+### Reverb Computation Order
+```
+              [Left Side]       [Right Side]
+READ REVERB   dLSame            dRSame
+READ REVERB   mLSame-1          mRSame-1
+READ REVERB   dRDiff            dLDiff
+XXXX REVERB   mLSame            mRSame            <-- WRITE becomes READ if REVERB DISABLED.
+READ REVERB   mLDiff-1          mRDiff-1
+READ REVERB   mLComb1           mRComb1
+XXXX REVERB   mLDiff            mRDiff            <-- WRITE becomes READ if REVERB DISABLED.
+READ REVERB   mLComb2           mRComb2
+READ REVERB   mLComb3           mRComb3
+READ REVERB   mLComb4           mRComb4
+READ REVERB   mLAPF1 - dAPF1    mRAPF1 - dAPF1
+READ REVERB   mLAPF2 - dAPF2    mRAPF2 - dAPF2
+XXXX REVERB   mLAPF1            mRAPF1            <-- WRITE becomes READ if REVERB DISABLED.
+XXXX REVERB   mLAPF2            mRAPF2            <-- WRITE becomes READ if REVERB DISABLED.
+```
+We suspect that the easiest way for the hardware to disable/enable the reverb function is to turn those WRITEs into READs.
+
+### Voices
+```
+Read Header word in current ADPCM block.
+Read Current Sample 16 bit word in current ADPCM block.
+Read [UNRELATED ADR ? Not related to current block...]
+```
+
+### Notes
+- Remaining cycles.
+    - 24x3x8 + 4x8 + 14x8 = 720 cycles are used out of 768 (each voice takes 3 operations of 8 cycles).
+    - That would mean 6 more READ/WRITE operations should still be possible (48 cycles / 8).
+- UNRELATED READ in voices: probably used for transfers [CPU->SPU RAM] or [SPU RAM->CPU].
+    - That would equate to a transfer rate of 24 x 2 bytes x 44,100 Hz = 2,116,800 bytes/sec.
+- The fixed READ timing would also explain why the CPU can't read SPU RAM directly: the SPU needs to be the master that pushes the data.
+    - It only works with DMA waiting for the data to be sent.
+
+Not everything is fully clear yet; proper SPU tests are needed to validate or invalidate the various assumptions.
+Our findings are based on a logic analyzer log of the PSX boot sounds, with the register values known thanks to emulators.
diff --git a/docs/timers.md b/docs/timers.md
index 04227ee..e442ee8 100644
--- a/docs/timers.md
+++ b/docs/timers.md
@@ -47,6 +47,7 @@ clock cycles when an IRQ occurs). In Toggle mode, Bit10 is set after writing to the Mode register, and becomes inverted on each IRQ (in one-shot mode, it remains zero after the IRQ) (in repeat mode it inverts Bit10 on each IRQ, so IRQ4/5/6 are triggered only each 2nd time, ie. when Bit10 changes from 1 to 0).
+"Free run" mode simply means that the counter is not reset when it reaches the target value. #### 1F801108h+N\*10h - Timer 0..2 Counter Target Value (R/W) ```