Latest revision as of 16:40, 30 August 2017
Compile and Run Transtomo on Deepthought2
Tolulope Olugboji, olugboji@umd.edu
March 31, 2016
Overview
This documentation outlines the steps for compiling and running the extended version of transtomo on the deepthought2 cluster at the University of Maryland. This version of transtomo is a personal adaptation and improvement of the freely distributed copy on iEarth geophysics scientific software. This version includes the following improvements:
- Functionality to load, initialize, and restart chains from models stored in ASCII files
- Functionality to resample and save full chain history in ASCII files
- Functionality to lock cells that do not participate in the update of travel time data
Work is also ongoing to extend transtomo to invert for azimuthal anisotropy. This documentation will provide details on that update when it is available. Feel free to contact me with any other questions on how to use this code and set it up on the deepthought2 cluster here at the University of Maryland.
Directory Listing
Here is a listing of the important directories that need to be set up to organize the code and data.

<Homedirectory>
    /homes/username(olugboji)
<code-data-directory>
    /lustre/username(olugboji)/transdimsurfacewavetomography/
<code-binary-directory>
    /home/deepthought2/username(olugboji)/bin/
<run-shell-execute-directory>
    /homes/username(olugboji)/binShell
<Input/Output-Data/Exhaust-directory>
    /lustre/username(olugboji)/tomoParamFiles/
<rjMcMC-library-directory>
    /home/deepthought2/olugboji

% Data that need to be backed up from the local store to the ... cluster directory ..

<local-prjDir>
    :/Documents/UMD Seismo/
<localcodestore>
    <local-prjDir>iEarthSoftware/transdimsurfacewavetomography
<localdatastore>
    <local-prjDir>transdimUSANT/All Ta Sta/
<localrestrtmodels>
    <local-prjDir>transdimUSANT/All Ta Sta/VorModels/inModels

% .. Data from cluster exhaust to local directory
Log in to deepthought2
ssh -X username@login.deepthought2.umd.edu
Updating and compiling source code
The following is the sequence of steps needed to compile and install the required libraries and other source code files:
- Update the <code-data-directory> with the latest version of the source code (available in <local-prjDir> or at <gitHub directory>; see the Directory Listing above).
- Compile and install the RJMCMC library (<code-data-directory>/RJMCMC 1.0.11/); also see the README file. On the cluster this involves the following steps:
> cd path to <code-data-directory>/RJMCMC 1.0.11/
> module load openmpi
> ./configure --prefix=<rjMcMC-library-directory>
> make
> make install
This is the only way to install the RJMCMC library on the cluster. Attempting a sudo make install will fail, since the user does not have root access on the cluster.
- After a successful install of the RJMCMC library, compile and install the tomo mpi and tomo code into the <code-binary-directory>. It is from this directory that the SBATCH shell scripts in <run-shell-execute-directory> load the binary executables. Here are the steps to do this:
> cd path to <Code-data-directory>/tomo-0.9.16
> setenv PKG_CONFIG_PATH /home/deepthought2/olugboji/lib/pkg-config
> module load openmpi
> ./configure -RJMCMC_FLAGS=-I<rjMcMC-library-directory> -I<$MPI_INC> -RJMCMC_LIBS=-L<$MPI_LIB> -lm -lmpi -lrjmcmc
> make
> cp tomo* <code-binary-directory>
- Note: if the configure command fails, run it with empty values for the RJMCMC_FLAGS and RJMCMC_LIBS parameters, then edit the makefile and set the relevant flags to the specified include and load flags. Also note that the two MPI environment variables, $MPI_INC and $MPI_LIB, are only set after the module load openmpi command has been run. These two environment variables are absolutely crucial for compiling the MPI version of the code, i.e. tomo mpi.
Build parameter files
Run the [{{#filelink: run_pythonSbatch.sh}} run_pythonSbatch.sh] script, which submits the single-job sbatch shell script [{{#filelink: runSingle_PhaseExpt.sh}} runSingle_PhaseExpt.sh], which in turn runs the Python code [{{#filelink: bldParamSingle.py}} bldParamSingle.py].
{{#fileanchor: run_pythonSbatch.sh}}
<source lang=bash>
#!/bin/tcsh
# run_pythonSbatch.sh
# Author: Tolulope Olugboji
# Date: March 6, 2015
#
# Used to build parameter files, by running them on umd's deepthought2

module load python/2.7.8

# replace with the following output directory for 3 expts - 1. Expt1_All 2. Expt2_RadialRal 3. Expt3_Seasons

set expts = (newExpts/RadialRal/ newExpts/Summer/ newExpts/Winter/)
set saveAs = (Expt2_RadialRal/ Expt3_Seasons/Sum/ Expt4_Seasons/Win/)
set use4slrm = (expt2 expt3 expt4)
set indxExpts = `seq 1 $#expts` # experiment iterator

echo "Single in Directory ..." $expts[1]
echo $saveAs[1]
echo $#expts

foreach iExpt ($indxExpts)

    set inDirX = '/lustre/olugboji/buildParamFiles/USANT15/Measure/'
    set outDirX = '/lustre/olugboji/buildParamFiles/USANT15/THBIParams/'

    echo "START !!!!! Expt " $iExpt "-----------------------------------------------------"
    echo "Input Dir ..." $inDirX$expts[$iExpt]
    echo "Output Dir " $outDirX$saveAs[$iExpt]

    set inDir = $inDirX$expts[$iExpt]
    set outDir = $outDirX$saveAs[$iExpt]

    # one job is submitted per measurement file found in the input directory
    set inLove = `ls $inDir`
    set indxPhase = `seq 1 $#inLove`
    #set indxPhase = `seq 12 21`

    echo "all Love" $inLove[1] "length" $#inLove
    echo $indxPhase

    foreach iPhase ($indxPhase)

        set file = $inLove[$iPhase]
        echo $iPhase " : " $inLove[$iPhase]
        # exported to the job through --export=ALL
        setenv JOBNAME "$file"
        setenv INDIR "$inDir"
        setenv OUTDIR "$outDir"

        set slurmOut = "/lustre/olugboji/buildParamFiles/USANT15/mpiOUT/slurm-$use4slrm[$iExpt]-$file.txt"

        #sbatch --job-name=$JOBNAME --time=2-0 --output=$slurmOut --export=JOBNAME ./runSingle_PhaseExpt.sh
        sbatch --job-name=$JOBNAME --time=2-0 --output=$slurmOut --export=ALL ./runSingle_PhaseExpt.sh
    end

    echo "END!!!!! Expt " $iExpt "-----------------------------------------------------"
end
</source>
{{#fileanchor: runSingle_PhaseExpt.sh}}
<source lang=bash>
#!/bin/tcsh
# runSingle_PhaseExpt.sh
# Adjust memory/walltime/ncpus as necessary
#
#
#SBATCH --ntasks=1
#SBATCH --mail-user=tolumorayo@gmail.com
#SBATCH --mail-type=ALL
#SBATCH -A ved-prj-hi
#SBATCH --share

module load python/2.7.8

# JOBNAME, INDIR and OUTDIR are exported by run_pythonSbatch.sh
echo "*********SBATCH called on ... $JOBNAME $INDIR $OUTDIR"

ipython bldParamSingle.py $JOBNAME $INDIR $OUTDIR
</source>
{{#fileanchor: bldParamSingle.py}}
<source lang=python>
# coding: utf-8

# bldParamSingle.py

# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# Author: Tolulope Olugboji
# Date: Nov. 17, 2014 | Original: - | updated July, 17
# !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#
# Objective: Parse Ekstrom USArray phase velocity data:
# construct files used by transdimensional tomography code tomo - Bodin et al.
#
# sources.dat - file containing source lat and lon locations
# receivers.dat - file containing receiver lat and lon locations
# paths.dat - file containing path vectors...
#
# use module load python/2.7.8 on deepthought2 (Python 2 syntax throughout)

import numpy as np
import math
from obspy.core.util.geodetics import gps2DistAzimuth
from geographiclib.geodesic import Geodesic
from os import listdir
from os.path import isfile, join
import os
import sys


# parse directory and sort files into Love and Rayleigh txt files
def parseDir(inDir):
    outLove = []
    outRal = []

    onlyfiles = [ f for f in listdir(inDir) if isfile(join(inDir,f)) ]
    #print onlyfiles

    for curFile in onlyfiles:
        phaseCode = curFile[0:1]

        if phaseCode == 'L':
            outLove.append(curFile)
        if phaseCode == 'R':
            outRal.append(curFile)

    return outLove, outRal

# Check if directory exists; if it does, return, otherwise create it
def mkDirIfNE(inPath):
    if not os.path.exists(inPath):
        print "making directory .."
        os.makedirs(inPath)
    else:
        print "Directory already exists"


# function that returns points on the great circle path between locA(lat, lon) and locB(lat2, lon2)
def returnPtsOnGCPath(locA, locB, nPoints=2, delta=0.2):
    EARTH_R = 6371.000000 # Earth Radius (km)
    # Coordinates of the path endpoints
    lat1, lon1 = locA[0], locA[1] # source
    lat2, lon2 = locB[0], locB[1] # receiver

    # Compute path information from locA to locB
    p = Geodesic.WGS84.Inverse(lat1, lon1, lat2, lon2)
    # define line information
    l = Geodesic.WGS84.Line(p['lat1'], p['lon1'], p['azi1'])

    # Number of segments: one per delta degrees of straight-line lat/lon separation
    dlon = (lon2 - lon1)
    dlat = (lat2 - lat1)
    n = int(math.ceil(math.sqrt(dlon*dlon + dlat*dlat)/delta))

    # Paths shorter than the requested minimum collapse to a single segment
    if ( n < nPoints ):
        nPoints = 1
    else:
        nPoints = n

    pntsOnPath = []
    for i in range(nPoints+1):
        b = l.Position(i*p['s12']/nPoints)
        pntsOnPath.append([ b['lon2'], b['lat2'], EARTH_R ])

    return np.asarray(pntsOnPath), nPoints+1

# Define Data Directories -- Ekstrom phase velocity files... UPDATE ! POINT TO USANT15
# see if arguments were passed successfully ...
print "!!!!!!!!!!! Called Build with argument !!!!!!!: ", sys.argv[1]

#RalFile = inRal[iFile]
RalFile = sys.argv[1]
inDir = sys.argv[2]
outDir = sys.argv[3]

# the first three characters of the file name (e.g. R05) name the output folder
nxtDir = RalFile[0:3]
mkDirIfNE(outDir + nxtDir)

### Out Data File for transdimensional code
outLoc = outDir+nxtDir+'/sources.dat'
outLocRec = outDir+nxtDir+'/receivers.dat'
outPaths = outDir+nxtDir+'/paths.dat'
outObsv = outDir+nxtDir+'/observations.dat'
outCodes = outDir+nxtDir+'/sourcesAnot.dat'

#### Read columns of the data file
fileLoc = inDir + RalFile
staCodes = np.loadtxt(fileLoc, dtype='str', usecols= (0,))
recCodes = np.loadtxt(fileLoc, dtype='str', usecols= (1,))
staLocs = np.loadtxt(fileLoc, usecols= (2,3))
recLocs = np.loadtxt(fileLoc, usecols=(4,5))

#### Load observational data
pathDist = np.loadtxt(fileLoc, usecols= (6,)) # distance in km along source-receiver path
pathVel = np.loadtxt(fileLoc, usecols= (10,)) # phase velocity in km/sec along source-receiver path
pathDV = np.loadtxt(fileLoc, usecols= (11,)) # phase velocity error in km/sec along source-receiver path


allStaCode = len(staCodes)
allRecCode = len(recCodes)
#print "statAll:, ", allStaCode, allRecCode, staCodes[1], staLocs[1]


staSet = list(set(staCodes))
recSet = list(set(recCodes))

uniqLen = len(staSet)
uniqLenRec = len(recSet)
print "Unique Set: ", uniqLen, uniqLenRec, staSet[0:3], recSet[0:3]


storeStatCode = []
storeStaLoc = []

# for every unique station code, keep the location of its first occurrence
for indxUniq in range(uniqLen):
    #print staSet[indxUniq]
    for lookUp in range(allStaCode):
        if (staSet[indxUniq] == staCodes[lookUp]):
            #print staCodes[lookUp], staLocs[lookUp]
            storeStatCode.append(staCodes[lookUp])
            storeStaLoc.append(staLocs[lookUp])
            break

print "After Parse: ", len(storeStatCode), len(storeStaLoc)
print "After Parse: ", storeStatCode[0:3], storeStaLoc[0:3]

headVal = len(storeStaLoc)


# save station and receiver locations [stations double as receivers in ambient noise tomography]..
#
# - comment out below to skip saving the arrays to file
np.savetxt(outLoc, np.fliplr(storeStaLoc), header= str(headVal), fmt='%.3f', comments='')
np.savetxt(outLocRec, np.fliplr(storeStaLoc), header= str(headVal), fmt='%.3f', comments='')

# Save all possible paths and their observations

nPointsOnPath = 2
lenObserv = allStaCode
searchLen = lenObserv
lenPaths = len(storeStaLoc)*len(storeStaLoc)

countPath = 0
countObsvPath = 0

staCodeList = staCodes.tolist()
recCodeList = recCodes.tolist()

listPathDist = pathDist.tolist()
listPathVel = pathVel.tolist()
listPathDV = pathDV.tolist()

print lenPaths
observTable = []
with open(outPaths, 'w') as outFilePath:
    outFilePath.write( '%d \n' % lenPaths )

    obsRow = [0, 0.0, 0.0]
    for staCode, staLoc in zip(storeStatCode, storeStaLoc):
        for recCode, recLoc in zip(storeStatCode, storeStaLoc):
            valProgress = float(countPath)/float(lenPaths) * 100.0
            #print "%0.5f " %valProgress, "%... done (", countPath, "of", lenPaths , ")"

            obsRow = [0, 0.0, 0.0]
            if staCode == recCode:
                #print 'self Path'
                nPoints = 1
                nPointsOnPath = 2
                path, nPointsOnPath = returnPtsOnGCPath(staLoc, staLoc, 1, delta=0.2)
                outFilePath.write( '%d \n' % nPointsOnPath )
                np.savetxt(outFilePath, path, fmt='%.3f')
            else:
                #scanConnectingPath(staCode, recCode)
                #print "Look-Up", staCode, recCode, staLoc, recLoc
                path, nPointsOnPath = returnPtsOnGCPath(staLoc, recLoc, 2, delta=0.2)

                nPoints = nPointsOnPath
                outFilePath.write( '%d \n' % nPoints )
                np.savetxt(outFilePath, path, fmt='%.3f')

                for iObserv in range(len(staCodeList)):
                    pathSta = staCodeList[iObserv]
                    pathRec = recCodeList[iObserv]

                    #for pathSta, pathRec in zip(staCodes, recCodes):
                    #print "Searching: ", pathSta, pathRec
                    if (staCode == pathSta and recCode == pathRec) or (staCode == pathRec and recCode == pathSta):

                        # travel time = distance / velocity; its error follows from the velocity error
                        #timeObs = float(pathDist[iObserv]) / float(pathVel[iObserv])
                        timeObs = listPathDist[iObserv] / listPathVel[iObserv]
                        #delObs = (timeObs / float(pathVel[iObserv]) ) * float(pathDV[iObserv])
                        delObs = (timeObs / listPathVel[iObserv] ) * listPathDV[iObserv]
                        obsRow = [1, timeObs, delObs]

                        countObsvPath = countObsvPath + 1

                        print "%0.5f " %valProgress, "%... done (", countPath, "of", lenPaths , ")"
                        print "Found: ", pathSta, pathRec
                        # Delete search index to make search smaller and tractable
                        del staCodeList[iObserv]
                        del recCodeList[iObserv]

                        del listPathDist[iObserv]
                        del listPathVel[iObserv]
                        del listPathDV[iObserv]
                        # End delete of search list ....

                        ### If path has valid observation then put code 1 in observation file, otherwise 0.
                        #searchLen = lenObserv - 1;
                        break

            observTable.append(obsRow)
            countPath = countPath + 1
            #clear_output(wait=True)


print "Done: Total Valid Paths = ", countObsvPath
np.savetxt(outObsv, observTable, fmt='%d %.3f %.3f')
</source>
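Each observation row written by the script turns a measured phase velocity into a travel time, t = d / v, and carries the velocity error over to a travel-time error to first order, dt = (t / v) dv. The following minimal sketch isolates that conversion. The function name `travel_time_row` and the numbers in the example are illustrative assumptions, not values from the data set.

```python
def travel_time_row(dist_km, vel_km_s, dvel_km_s):
    """Return [flag, travel time (s), travel-time uncertainty (s)].

    Mirrors the row layout of observations.dat: flag 1 marks a path
    that has a valid observation.
    """
    if vel_km_s <= 0.0:
        raise ValueError("phase velocity must be positive")
    time_obs = dist_km / vel_km_s                 # t = d / v
    del_obs = (time_obs / vel_km_s) * dvel_km_s   # dt = (t / v) * dv
    return [1, time_obs, del_obs]


if __name__ == "__main__":
    # Illustrative numbers only: 400 km path, 4 km/s velocity, 0.04 km/s error
    row = travel_time_row(400.0, 4.0, 0.04)
    print("%d %.3f %.3f" % tuple(row))
```

Paths with no matching measurement keep the default row `[0, 0.0, 0.0]` in the script, so the flag column is what separates observed from unobserved station pairs.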
Setting up parameter and data files
Now that the libraries and binary executables are set up, the next stage is to set up the input parameters and the data environment needed to stage the code on the cluster. Two directories are involved.
The first is the <Input/Output-Data/Exhaust-directory>, which stores the input data, the parameter files needed to start the tomompi code, and the results generated after the analysis completes successfully. The datasets are organized by phase velocity, e.g. <Input/Output-Data/Exhaust-directory>/<folder>, where <folder> represents one of the 22 phase velocity datasets: L05-L40 and R05-R40. Here is a listing of the data directory:
- <Input/Output-Data/Exhaust-directory>/<folder>: holds the important observation data, source-receiver configuration and path location files, e.g. observations.dat, paths.dat, sources.dat, etc.
- <Input/Output-Data/Exhaust-directory>/<folder>/results: stores the saved output after a completed run (e.g. output mean, Voronoi-partition location and velocity history, misfit history, model save state, etc.).
- <Input/Output-Data/Exhaust-directory>/<folder>/restart: stores the model states (i.e. high-resolution tessellated model states or intermediate model states) to be used in the case of a restartable chain.
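To check a dataset folder before staging a run, it can be useful to read paths.dat back in. The sketch below assumes the layout written by bldParamSingle.py above: a first line with the number of paths, then for each path a line with its point count followed by one "lon lat radius" row per point. The function name `read_paths` and the sample values are hypothetical, and this is not how the tomo code itself reads the file.

```python
import io

# Hypothetical two-path sample in the layout written by bldParamSingle.py
SAMPLE = (
    u"2 \n"
    u"2 \n"
    u"-100.000 40.000 6371.000\n"
    u"-100.000 40.000 6371.000\n"
    u"3 \n"
    u"-100.000 40.000 6371.000\n"
    u"-99.900 40.050 6371.000\n"
    u"-99.800 40.100 6371.000\n"
)


def read_paths(stream):
    """Parse a paths.dat-style stream into a list of paths.

    Each path is a list of (lon, lat, radius) tuples.
    """
    n_paths = int(stream.readline().split()[0])
    paths = []
    for _ in range(n_paths):
        n_points = int(stream.readline().split()[0])
        points = []
        for _ in range(n_points):
            lon, lat, radius = (float(x) for x in stream.readline().split())
            points.append((lon, lat, radius))
        paths.append(points)
    return paths


if __name__ == "__main__":
    for index, path in enumerate(read_paths(io.StringIO(SAMPLE))):
        print("path %d has %d points" % (index, len(path)))
```

For a real file, the same function can be called on an open file handle, e.g. `read_paths(open("paths.dat"))`. Because the script writes a path for every ordered station pair, the path count in the header should equal the square of the station count in sources.dat.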
The second is the <run-shell-execute-directory>, which holds the shell SBATCH scripts for staging the code on the cluster. Here are the files in this directory:
- <run-shell-execute-directory>/tomo sbatch.sh: script that runs tomo mpi for all 22 phase velocity maps. You can tweak this script to run single maps, to specify an initial number of partitions, or to restart chains from set stages or particular input models. It is a wrapper around tomo single.sh, which does all the heavy lifting.
- <run-shell-execute-directory>/tomo single.sh: script used by tomo sbatch.sh above for running tomo mpi on single phase velocity maps. It also sets up all the required environment variables, determines how many chains to run, and sets up the required flags.
- <run-shell-execute-directory>/tomoParamMPI.nml: parameter file used by tomo single.sh for setting up all the input parameters required by tomo mpi. It sets parameters such as file paths, number of steps, distribution range, perturbation parameters, seed values for the probability distribution, state of the starting models, etc. Note that we provide another parameter file, <run-shell-execute-directory>/rstrtTomoParamMPI.nml, for setting up parameters if the chain is to be restarted from a specified initial model state (typically a high-resolution near-mean state).
- <run-shell-execute-directory>/dwnldRslts.sh: script used to download results data from the remote deepthought2 cluster machine to the <outputData-prjDir> on the local machine for post-processing and visualization.