bu_cms_history/SN_046

S/N 46

2007-09-17 (hazen, 904)

"Deep firmware problems", LRB4 problems.

Try to read JTAG chain from VME:

  ./DCCrepair.exe sbs:0 8 -y -j log3 idcode.jam

  *** Chain Continuity Failure (2) -- IR is returning with TDO all zeros   ***

OK, let's try the cable. No go.

2007-09-18 (hazen, 904)

Add power-on reset jumper. Reprogram CPLD to v2 with the cable. Disconnect, cycle power. Still can't access the JTAG chain with either VME or the JTAG cable(s). ????

Reconnect jumper, power up again. Program PCI 1/2/3 using Altera cable. Disconnect jumper (leaving power on). Program PCI 1/2/3 again. Now use DCCprogrammer to program LOG1/2/3 flash to latest.

Now we're getting somewhere:

  cmsmoe4 ~/hazen/hcal_3_9_7/TriDAS/hcal/hcalDCC/tool > ./DCCprogrammer.exe sbs:0 8 -i
  ...
  INFO - This is a DCC v4
  ** Flash access OK **
  ** Firmware Revisions:
        LOG1: 0x000c (12)
        LOG2: 0x000b (11)
        LOG3: 0x0015 (21)
        MIP1: 0x001b (27)
        MIP2: 0x001b (27)
        MIP3: 0x001b (27)
        MIP4: --------
        MIP5: 0x001b (27)
      XILINX: 0x2c0f (11279)
        CPLD:   0x02 (status bits=0x00)
   serial no: 46

OK, update Xilinx to 2c11 while we're here.

Now, the MIP4 problem. Check it with JTAG. Looks OK:

  ./DCCrepair.exe sbs:0 8 -y -j mip4 idcode.jam

  ...
  INFO (JAM PLAYER): DevSel = 0xb Nbit = 0 TCK_freq = 0x1
  CRC mismatch: expected BAD6, actual 76E6
  ******************************************************************************
  * Altera Chain Interrogation Version 2.02                                    *
  *   Copyright (c) 1999-2001 Altera Corporation.  All Rights Reserved.        *
  *   Modified 20 Aug 2007 by E. Hazen to recoginize HCAL/DCC Devicessss       *
  ******************************************************************************
  Chain Continuity Checker
  DCC JAM Player running. Please wait
    Chain Continuity during IR is not stuck at zero or one
  ******************************************************************************
  Chain Length -- Load IR of all ones then count DR length
    Number of Devices is 1
  ******************************************************************************
  IR Length Calculator
    Instruction Register Length is 10
  ******************************************************************************
  IDCODE Reader
    ---------- | ---- ------------------- ------------- - |
    TDO -> TDI | Rev  Device              Mfgr          1 |
    ---------- | ---- ------------------- ------------- - |
    Device #1  | 0000 0001 0000 0000 0010 0000 1101 110 1 |
    ---------- | ---- ------------------- ------------- - |
  ******************************************************************************
  Device Identifier -- Search for device name from list of device IDCODE values
    ---------- |      ------------------- -------------   |
    TDO -> TDI |      Device              Mfgr            |
    ---------- |      ------------------- -------------   |
    Device #1  |      EPC2                Altera          |
    ---------- |      ------------------- ------------- - |
  ******************************************************************************
  Exit code = 0... Success

  DCC JAM closing JTAG hardware.
  Elapsed time = 00:00:01
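
The bit pattern in the IDCODE Reader table can be cross-checked by hand. A minimal Python sketch, assuming the standard IEEE 1149.1 IDCODE layout (4-bit version, 16-bit part number, 11-bit manufacturer code, and a final bit that is always 1); the function name decode_idcode is made up for this note and is not part of the DCC tools:

```python
def decode_idcode(bits: str) -> dict:
    """Split a 32-bit JTAG IDCODE (IEEE 1149.1 layout) into its fields."""
    word = bits.replace(" ", "")
    if len(word) != 32 or set(word) - {"0", "1"}:
        raise ValueError("expected 32 binary digits")
    value = int(word, 2)
    return {
        "idcode": value,
        "version": value >> 28,                # bits 31..28
        "part": (value >> 12) & 0xFFFF,        # bits 27..12
        "manufacturer": (value >> 1) & 0x7FF,  # bits 11..1
        "marker": value & 1,                   # bit 0, 1 for a valid IDCODE
    }


# Bit string copied from the "Device #1" row of the interrogation output.
fields = decode_idcode("0000 0001 0000 0000 0010 0000 1101 110 1")
print(f"IDCODE 0x{fields['idcode']:08x}  version {fields['version']}  "
      f"part 0x{fields['part']:04x}  manufacturer 0x{fields['manufacturer']:03x}")
```

A manufacturer field of 0x06e together with the EPC2 name reported by the tool is consistent with the chain containing just the one configuration device expected at this site.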

OK, so reprogram it (could do this with LRBprog too):

  ./DCCrepair.exe sbs:0 8 -y -j mip4 lrbv27.jam

  JTAG initialization
  VSI0 base of 0x28000000 will be used to access PCI configuration space
  check:  LC ID=00040070 (ok)
  JTAG control register access check: (ok)
  VME Device created for JTAG access
  Check read 00007fff from JTAG control using new bus adapter
  .
  Checking file lrbv27.jam for valid ACTION statements
    Found action program
    Found action blankcheck
    Found action verify
    Found action erase
    Found action read_usercode
    Found action init_configuration
  6 actions found in JAM file
  .
  Found multiple possible actions in JAM file.  Please choose one:
  0: program
  1: blankcheck
  2: verify
  3: erase
  4: read_usercode
  5: init_configuration
  Your choice: 0
  About to perform action program on device mip4
  BogoMips reported as  4784.12
  Test delay was   0.31
  Delay calibrated for Jam player
  Jam STAPL Player Version 2.3
  Copyright (C) 1997-2000 Altera Corporation
  .
  INFO (JAM PLAYER): DevSel = 0xb Nbit = 0 TCK_freq = 0x1
  DCC JAM Player running. Please wait
  Device #1 Silicon ID is A98(01)
  erasing EPC device(s)...
  programming EPC device(s)...
  .....................................verifying EPC device(s)...
  ......................................DONE
  Exit code = 0... Success
  .
  DCC JAM closing JTAG hardware.
  Elapsed time = 00:02:38
  69.730u 90.430s 2:48.40 95.1%   0+0k 0+0io 2978pf+0w

Programming looks OK. However, MIP4 is still missing from the revision list:

  ./DCCprogrammer.exe sbs:0 8 -i

  INFO - This is a DCC v4
  ** Flash access OK **
  ** Firmware Revisions:
        LOG1: 0x000c (12)
        LOG2: 0x000b (11)
        LOG3: 0x0015 (21)
        MIP1: 0x001b (27)
        MIP2: 0x001b (27)
        MIP3: 0x001b (27)
        MIP4: --------
        MIP5: 0x001b (27)
      XILINX: 0x2c11 (11281)
        CPLD:   0x02 (status bits=0x00)
   serial no: 46
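
Spotting a dropped entry in this listing by eye gets tedious across many boards. A minimal Python sketch that parses the revision lines and reports entries shown as dashes; the sample text is copied from the listing above, and the helper name parse_revisions is hypothetical (not part of DCCprogrammer):

```python
import re

# Revision lines copied from the DCCprogrammer -i output above.
SAMPLE = """\
        LOG1: 0x000c (12)
        LOG2: 0x000b (11)
        LOG3: 0x0015 (21)
        MIP1: 0x001b (27)
        MIP2: 0x001b (27)
        MIP3: 0x001b (27)
        MIP4: --------
        MIP5: 0x001b (27)
      XILINX: 0x2c11 (11281)
        CPLD:   0x02 (status bits=0x00)
"""

# A name, a colon, then either a hex value or a run of dashes.
LINE = re.compile(r"^\s*(\w+):\s+(0x[0-9a-fA-F]+|-+)")


def parse_revisions(text: str) -> dict:
    """Map each device name to its revision, or None when shown as dashes."""
    revisions = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            name, value = match.groups()
            revisions[name] = None if value.startswith("-") else int(value, 16)
    return revisions


revisions = parse_revisions(SAMPLE)
missing = [name for name, rev in revisions.items() if rev is None]
print("missing:", missing)
```

The hex values convert to the decimal numbers the tool prints in parentheses, for example 0x2c11 is 11281.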

Take out the DCC and swap the MIP3 and MIP4 LRBs. LRB serial numbers were originally MIP3: 260, MIP4: 262.

MIP4 site still not responding. Try a PCI device scan with DCCrepair:

  cmsmoe4 ~/hazen/hcal_3_9_7/TriDAS/hcal/hcalDCC/tool > ./DCCrepair.exe sbs:0 8 -v -r -c -b -p
  ...
  PCI configuration information
  VSI0 base of 0x28000000 will be used to access PCI configuration space
  Configuring PCI bus 0/1 bridge
  Configuring PCI bus 1/2 bridge
  Bus Dev Alias PCI ID     Device ID  CSR      BAR0     BAR1     BAR2     BAR3     BAR4     BAR5
    0   0  br3 ac21104c  PCI bridge  02100143 00000000 00000000 00020100 02000101 00000000 00000000
    0   1 log3 00030072    DCC LOG3  04000000 00000008 00000000 00000000 00000000 00000000 00000000
    0   2  uv2 00020201          ??  00402f21 00020201 00402f21 00020201 00402f21 00020201 00402f21
    0   3   bc ffffffff          ??
    0   4   lc 00040070  Local Ctrl  04000000 00000000 00000001 00000000 00000000 00000000 00000000
    1   0  br2 ac21104c  PCI bridge  02100143 00000000 00000000 00020201 02000101 00000000 00000000
    1   1 log2 00020072    DCC LOG2  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    1   2 mip3 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    1   3 mip4 ffffffff          ??
    1   4 mip5 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   0   -- ffffffff          ??
    2   1 log1 00010072    DCC LOG1  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   2 mip0 ffffffff          ??
    2   3 mip1 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   4 mip2 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000

MIP4 PCI ID is ffffffff, i.e. nothing responds at that site. This is a bad sign... likely a hardware problem. It is apparently not the LRB, since the fault stayed with the MIP4 site after the LRB swap. Could be either a motherboard or logic board problem.
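
The same check can be done mechanically on a saved scan. A minimal Python sketch, assuming the column order shown in the table (Bus, Dev, Alias, PCI ID); the rows are copied from the scan above, and the helper names parse_scan and silent_sites are hypothetical:

```python
# A few rows copied from the DCCrepair PCI scan above (bus 1 only).
SCAN = """\
    1   0  br2 ac21104c  PCI bridge  02100143 00000000 00000000 00020201 02000101 00000000 00000000
    1   1 log2 00020072    DCC LOG2  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    1   2 mip3 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    1   3 mip4 ffffffff          ??
    1   4 mip5 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
"""


def parse_scan(text: str) -> list:
    """Return (bus, dev, alias, pci_id) tuples for each table row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and anything that does not start with a bus number.
        if len(parts) < 4 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        bus, dev, alias, pci_id = parts[:4]
        rows.append((int(bus), int(dev), alias, pci_id.lower()))
    return rows


def silent_sites(rows: list) -> list:
    """Aliases whose PCI ID reads back as all ones (no device answered)."""
    return [alias for _, _, alias, pci_id in rows if pci_id == "ffffffff"]


rows = parse_scan(SCAN)
print("not responding:", silent_sites(rows))
```

In the full scan, bc and mip0 also read ffffffff, so the result has to be compared against the sites that are expected to be populated on this board.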

2008-01-07 (hazen, BU)

Power up and run DCCdiagnose. Get this error right away:

  ERROR - Did not find 'log2_mem' as needed in the master map
  DCC::initialize() returned false!

Try a PCI scan.

  cms2 ~/dcc_firmware > DCCrepair.exe 11 -b -r -c -v -p
  ...
  Bus Dev Alias PCI ID     Device ID  CSR      BAR0     BAR1     BAR2     BAR3     BAR4     BAR5
    0   0  br3 ac21104c  PCI bridge  02100143 00000000 00000000 00020100 02000101 00000000 00000000
    0   1 log3 00030072    DCC LOG3  04000000 00000008 00000000 00000000 00000000 00000000 00000000
    0   2  uv2 00020201          ??  00402f21 00020201 00402f21 00020201 00402f21 00020201 00402f21
    0   3   bc --------
    0   4   lc 00040070  Local Ctrl  04000000 00000000 00000001 00000000 00000000 00000000 00000000
    1   0  br2 ac21104c  PCI bridge  02100143 00000000 00000000 00020201 02000101 00000000 00000000
    1   1 log2 ffffffff          ??
    1   2 mip3 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    1   3 mip4 ffffffff          ??
    1   4 mip5 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   0   -- ffffffff          ??
    2   1 log1 00010072    DCC LOG1  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   2 mip0 ffffffff          ??
    2   3 mip1 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000
    2   4 mip2 00400055         LRB  04000000 00000000 00000000 00000000 00000000 00000000 00000000

Hmm... log2 seems not to have survived the trip from CERN. Try reprogramming:

  > DCCprogrammer.exe 11 -p LOG2 pci2vb.hex

Still no log2!

  cms2 ~/dcc_firmware/current > DCCprogrammer.exe 11 -i
  HCAL DCCprogrammer v1.2 rev 12 Feb 2007
  Slot number 11 specified
  ...
  ERROR - Did not find 'log2_mem' as needed in the master map
  INFO - This is a DCC v4
  ** Flash access OK **
  ** Firmware Revisions:
          LOG1: 0x000c (12)
          LOG2: --------
          LOG3: 0x0015 (21)
          MIP1: 0x001b (27)
          MIP2: 0x001b (27)
          MIP3: 0x001b (27)
          MIP4: --------
          MIP5: 0x001b (27)
        XILINX: 0x2c11 (11281)
          CPLD:   0x02 (status bits=0x00)
     serial no: 46

Oh, the logic board wasn't firmly seated. Now log2 is back, but still no mip4.

Try a different logic board (not very logical, but easy to do). Argh! Now the crate gives a -12V overcurrent! Take off the logic board; the MB still gives overcurrent. Sigh. Found a lock washer wedged between F3 (-12V fuse) and the adjacent connector. Remove it.

Now, with only MB and MIPs, still no MIP4.

2008-01-09 (hazen, BU)

R530 is missing on the back of the motherboard. Replace it. LRB/MIP4 is OK now.