Update...I've got it figured out.
Instead of bumping a Y index for each of the 2 sprites, bump each sprite's indirect vector LSB instead. Y can therefore remain set at zero when setting up the sprite display (saves 4 more cycles). There was also a superfluous instruction, which has been removed.
Because the indirect vectors are bumped, I store the original values in the now-unused Player0Def/Player1Def variables...and pull them back after the display is drawn (so that they remain unchanged for the other 2 display frames).
As before, the scanline count is placed in the low nybble of PF0 (the TIA only displays the upper 4 bits of PF0, so the low nybble is free). However, EACH line must now contain its own CTRLPF value...so 4 values per line (PF0, PF1, PF2, CTRL). This last value should contain ball size ($20) + priority ($00 = dark, $04 = normal) + display mode ($00 = mirrored/$01 = reversed). There's not enough time to implement the other bits, so you'll still need to set panels up in the setup routine (the store to CTRLPF can be removed, though).
Here's the updated kernel...
PrintDisplay:;draw a frame...
STA HMCLR ;3 Clear horizontal motion
LDA Obj1X ;3 Position Player00 Sprite to the X Coordinate of Object1
LDX #$00 ;2
JSR PosSpriteX ;6
LDA Obj2X ;3 Position Player01 Sprite to the X Coordinate of Object2
INX ;2 (the subroutine PosSpriteX didn't change X)
JSR PosSpriteX ;6
LDA PlayerX ;3 Position Ball Sprite to the X Coordinate of the Man
LDX #$04 ;2
JSR PosSpriteX ;6
STA WSYNC ;3 Wait for horizontal Blank
STA HMOVE ;3 Apply Horizontal Motion
STA CXCLR ;3 Clear Collision Latches
PrintDisplay_1:
LDX INTIM ;4 Wait for end of the current frame
BNE PrintDisplay_1 ;2
STX RoomDefIndex ;3 Set room definition index
STX GRP1 ;3 Clear any graphics for Player01
INX ;2
STX VDELP1 ;3 vertically delay Player 01
;----------------------------------------------------
;do not use these lines if RoomLo/Hi is not shared...
LDA Control ;3 Get 3-frame room number
AND #$7F ;2 (strip inactive game bit)
TAY ;2
LDA RoomDataTable0,Y ;4 Get the room gfx LSB pointer
STA RoomLo ;3
LDA RoomDataTable1,Y ;4 Get the room gfx MSB pointer
STA RoomHi ;3
;----------------------------------------------------
LDA Obj1Lo ;3 transfer low sprite1 pointer to temp
STA Player0Def ;3
LDA Obj2Lo ;3 transfer low sprite2 pointer to temp
STA Player1Def ;3
LDA PlayerY ;3 Get the Y Coordinate of the Man
SEC ;2
SBC #$04 ;2 And Adjust it (By Four Scan Lines) for printing
STA PlayerYadj ;3 (so Y Coordinate Specifies Middle)
LDY #$00 ;2 Set initial room definition index
LDX #$68 ;2
STX ScanLineCnt ;3 Set initial Scan Line Count
;NOTE: bankswitched kernel would begin here------------------------------------------
STA WSYNC ;3 Wait for horizontal Blank
STY VBLANK ;3 Clear any Vertical Blank
BPL Print_Top_Line ;2 always branch (X=$68)
Print_More_Lines:
LDY RoomDefIndex ;3 Get room definition index
STA WSYNC ;3 Wait for horizontal Blank
STX GRP0 ;3 Display Player00 definition byte (if Wanted)
STA ENABL ;3 Enable Ball (If Wanted)
Print_Top_Line:
LAX (RoomLo),Y ;5 Get first room definition byte (use LAX to preserve)
STA PF0 ;3 ...and display
INY ;2 bump index
LDA (RoomLo),Y ;5 Get next room definition byte
STA PF1 ;3 ...and display
INY ;2 bump index
LDA (RoomLo),Y ;5 Get next room definition byte
STA PF2 ;3 ...and display
INY ;2 bump index
LDA (RoomLo),Y ;5 Get row's CTRL value
STA CTRLPF ;3 Store to playfield control
TXA ;2 Get PF0 definition back
AND #$0F ;2 keep only low nybble of PF0
STA PixelCount ;3 Store to line counter
INY ;2 bump index
STY RoomDefIndex ;3 ...and save for Next Time
LAX ScanLineCnt ;3 Get the scan line. Use LAX for subtraction later
DEX ;2 (save 3 cycles by using DEX instead of DEC)
PrintPlayer01:;Print Player01 (Object 02)
STX ScanLineCnt ;3 Store the scan line
LDY #$00 ;2 clear Y index for both indirect loads
CLC ;2 Use carry to kill off difference between X and A
SBC Obj2Y ;3 Have we reached Object2's Y Coordinate?
STA WSYNC ;3 Wait for horizontal Blank
BPL PrintPlayer00 ;2 ...If Not, Branch
LDA (Obj2Lo),Y ;5 Get the Next Player01 Definition byte
STA GRP1 ;3 ...and display
BEQ PrintPlayer00 ;2 If Zero then Definition finished
INC Obj2Lo ;5 Goto next Player01 definition byte
PrintPlayer00:;Print Player00 (Object01), Ball (Man) and Room.
TXA ;2 Get Current Scan Line
LDX #$00 ;2 Clear X (player 0 gfx)
SEC ;2
SBC Obj1Y ;3 Have we reached the Object1's Y coordinate?
BPL PrintPlayer00_1 ;2 If not then Branch
LAX (Obj1Lo),Y ;5 Get the Next Player00 definition byte (give to X)
BEQ PrintPlayer00_1 ;2 If Zero then Definition finished
INC Obj1Lo ;5 Goto next Player00 definition byte
PrintPlayer00_1:
LDA ScanLineCnt ;3 Get Scan line count
SEC ;2
SBC PlayerYadj ;3 Have we reached the Man's Y Coordinate?
AND #$FC ;2 Mask value to four either side (getting depth of 8)
BNE PrintPlayer00_2 ;2 If >4 on either end, branch (skip ball display)
LDA #$02 ;2 Enable Ball Graphic
PrintPlayer00_2:
DEC PixelCount ;5 decrement the line counter
BMI Print_More_Lines ;2 if this row's line count has run out, branch (fetch the next row)
PrintPlayer00_3:
STA WSYNC ;3 Wait for horizontal Blank
STA ENABL ;3 Enable Ball (If Wanted)
STX GRP0 ;3 Display Player00 definition byte (if Wanted)
PrintPlayer00_4:
LAX ScanLineCnt ;3 Get the scan line. Use LAX for subtraction later
DEX ;2 (save 3 cycles by using DEX instead of DEC)
CPX #$08 ;2 Have we reached to within 8 scanlines of the bottom?
BPL PrintPlayer01 ;2 If not, Branch
STX VBLANK ;3 Turn on VBLANK, now 3 cycles earlier to fix side pixel
;NOTE: bankswitched kernel would end here--------------------------------------------
LDA #$00 ;2
STA GRP1 ;3 Clear any graphics for Player01
STA GRP0 ;3 Clear any graphics for Player00
LDA Player0Def ;3 Restore low pointer vectors...
STA Obj1Lo ;3
LDA Player1Def ;3
STA Obj2Lo ;3
LDA #$20 ;2
STA TIM64T ;4 Start timing this frame
RTS ;6 return to main loop
Thanks for the suggestion!

This is a very good addition. Each screen will require at least 28 bytes...but it's still possible to share bytes if you are careful.
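For anyone wondering what a 28-byte screen looks like, here's a rough sketch of the 4-bytes-per-row layout, assuming seven rows per screen. Only the byte order (PF0, PF1, PF2, CTRL) and the CTRL bit values come from the post above. The label, the constant names, the playfield patterns, and the counts in the low nybble of PF0 are all made up for illustration, so work out real counts against how the kernel decrements PixelCount.

```asm
; Hypothetical constant names; the values are the ones quoted in the post.
BALL_SIZE    = $20              ; ball size bits
PRI_DARK     = $00              ; priority: dark
PRI_NORMAL   = $04              ; priority: normal
MODE_MIRROR  = $00              ; display mode: mirrored
MODE_REVERSE = $01              ; display mode: reversed

; Hypothetical room: 7 rows x 4 bytes = 28 bytes.
; Byte order per row: PF0 (pattern in high nybble + count in low nybble),
; PF1, PF2, CTRL.  Counts here are placeholders only.
ExampleRoomGfx:
        .byte $F0+$07,$FF,$FF,BALL_SIZE+PRI_NORMAL+MODE_MIRROR   ; top wall
        .byte $30+$0F,$00,$00,BALL_SIZE+PRI_NORMAL+MODE_MIRROR   ; side walls only
        .byte $30+$0F,$00,$00,BALL_SIZE+PRI_NORMAL+MODE_MIRROR
        .byte $30+$0F,$0F,$00,BALL_SIZE+PRI_DARK+MODE_REVERSE    ; this row alone is dark/reversed
        .byte $30+$0F,$00,$00,BALL_SIZE+PRI_NORMAL+MODE_MIRROR
        .byte $30+$0F,$00,$00,BALL_SIZE+PRI_NORMAL+MODE_MIRROR
        .byte $F0+$0F,$FF,$FF,BALL_SIZE+PRI_NORMAL+MODE_MIRROR   ; bottom wall
```

Since the CTRL byte is per row, a single room can mix modes and priorities line by line, which is the whole point of the extra byte.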
NOTE: Because only the LSBs of the sprite vectors are updated (INC never carries into the MSB), it's -required- that sprites DO NOT cross a page boundary under any circumstances.
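One way to catch a page crossing before it bites is to have the assembler check each sprite table. This is only a sketch and assumes DASM (its `if`/`echo`/`err` directives and the `>` high-byte operator); the label names and the graphics bytes are hypothetical. Other assemblers will need their own equivalent.

```asm
; Hypothetical sprite definition; the kernel stops on the zero byte.
GfxExample:
        .byte %00111100
        .byte %01111110
        .byte %00111100
        .byte $00               ; terminator, the last byte the kernel reads
GfxExampleEnd:

; Assemble-time check (DASM assumed): compare the page (high byte) of the
; first byte with the page of the last byte and abort if they differ.
        if (>GfxExample) != (>(GfxExampleEnd-1))
            echo "GfxExample crosses a page boundary"
            err
        endif
```

If the check fires, moving the table or padding in front of it so it starts in a fresh page should be enough.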