bu_cms_history/CMSSW

HOWTO find data

This is incomplete! I hope to post here a recipe for finding data from, e.g., global runs.

actual data, scroll to the right to get the run number. Click the run number for more info. Once you have chosen a run, proceed to "Updated recipe" below. (From Jeremy, Dec 2007.)
  • Message Logger

    CMSSW Links

    Updated recipe to get started

      # before you can run any dbsql commands etc you must have *some* CMSSW environment.
      # you can start with this one (which happens to be the current one as of July 2012):
      export SCRAM_ARCH=slc5_amd64_gcc462
      # (howto know what to set it to?  No idea, ask someone!)
      # ls /afs/cern.ch/cms for some clues to choices...
      cmsrel CMSSW_5_2_4
      cd CMSSW_5_2_4
      scramv1 b
      cmsenv
    

      # find datasets in a run
      dbsql "find dataset where run = 200243"
      # (or)
      dbsql "find dataset where dataset like *Run2012*RAW"
    

      # Figure out what CMSSW version to use for a run (see above)
      dbsql "find release where dataset = /MinimumBias/Run2012C-v1/RAW"   # for example
      # the CMSSW release(s) used for that dataset are listed
      # if the release isn't one you already have, repeat the 'cmsrel' step with the correct one.
    

      # find the files for a particular dataset and run
      dbsql "find file where dataset=/MinimumBias/Run2012C-v1/RAW and run=200243"
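
The three queries above differ only in the run number and dataset name, so they can be wrapped in small shell functions. This is a minimal sketch, not part of the original recipe: the function names are hypothetical, and the functions only build the query strings shown above, so they can be checked without a CMSSW environment.

```shell
# Hypothetical helpers: each one prints a query string in the form used above.

# Query for all datasets that contain a given run.
datasets_in_run_query() {
    printf 'find dataset where run = %s\n' "$1"
}

# Query for the CMSSW release(s) recorded for a dataset.
release_for_dataset_query() {
    printf 'find release where dataset = %s\n' "$1"
}

# Query for the files of one dataset in one run.
files_query() {
    printf 'find file where dataset=%s and run=%s\n' "$1" "$2"
}

# Usage (inside a CMSSW environment, where dbsql is available):
#   dbsql "$(files_query /MinimumBias/Run2012C-v1/RAW 200243)"
```

Keeping the query construction separate from the dbsql call makes it easy to print a query and inspect it before running it.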
    

    Various Data Formats

    My understanding of this is still incomplete, but it seems there are at least 3 different data formats in use, each requiring a different CMSSW configuration file.

    1. The "pool" format for global runs from, e.g., CASTOR.

       // Loads the events
       source = PoolSource {
         untracked vstring fileNames = { 'file:/data/001617DBD49A.root' }
       }
    

    2. The "NewEventStream" format

       source = NewEventStreamFileReader
       {
         untracked vstring fileNames =  { "file:/tmp/35173_1.dat" }
         int32 max_event_size = 2000000
         int32 max_queue_depth = 5
       }
    

    3. The "testbeam" format for private runs (e.g. files in /hcal/bigspool/usc)

      source = HcalTBSource
      {
        untracked vstring fileNames =  { "file:/export/data/USC/USC_035529.root" }
        untracked vstring streams = { "HCAL_DCC724" }
      }
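
When many files are involved, writing the source block by hand gets tedious. The sketch below generates a PoolSource block like the one in item 1 from a list of file paths. The function name pool_source is hypothetical, and the comma-separated form for several files is an assumption about the configuration syntax, so check the output against a working configuration file before relying on it.

```shell
# Hypothetical helper: print a PoolSource block for the files given as arguments.
# Each argument is a local path; the 'file:' prefix is added here.
pool_source() {
    printf 'source = PoolSource {\n'
    printf '  untracked vstring fileNames = {'
    sep=' '
    for f in "$@"; do
        printf "%s'file:%s'" "$sep" "$f"
        sep=', '    # assumption: further entries are comma-separated
    done
    printf ' }\n}\n'
}

# Usage:
#   pool_source /data/001617DBD49A.root >> my_config.cfg
```

The file name my_config.cfg in the usage line is only a placeholder.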
    

    CASTOR Notes

    To copy a file from CASTOR to a remote machine without a temporary copy on, e.g., lxplus:

      rfcat /castor/cern.ch/cms/store/data/....dat | ssh hazen@cms1.bu.edu 'cat > /tmp/35173_2.dat'
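
If you copy files this way often, the pipeline above can be wrapped in a function. This is a rough sketch under stated assumptions: the function name castor_to_remote and the DRY_RUN variable are hypothetical, and the real copy still needs rfcat and ssh access exactly as in the command above.

```shell
# Hypothetical wrapper around the rfcat | ssh pipeline.
# Arguments: CASTOR path, remote user@host, destination path on the remote machine.
# With DRY_RUN=1 the pipeline is printed instead of executed.
castor_to_remote() {
    src=$1 remote=$2 dest=$3
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf "rfcat %s | ssh %s 'cat > %s'\n" "$src" "$remote" "$dest"
        return 0
    fi
    # Stream the file from CASTOR straight into a file on the remote host.
    # Note: this quoting breaks if the destination path contains a single quote.
    rfcat "$src" | ssh "$remote" "cat > '$dest'"
}

# Usage (arguments are placeholders):
#   DRY_RUN=1 castor_to_remote <castor-path> <user@host> <remote-path>
```

Running with DRY_RUN=1 first lets you check the paths before starting a long transfer.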
    

    If the copy doesn't start, check the CASTOR status:

      stager_qry -M /castor/cern.ch/...
    

    See the CASTOR User Guide if all else fails!