MediaWiki API result

This is the HTML representation of the JSON format. HTML is good for debugging, but is unsuitable for application use.

Specify the format parameter to change the output format. To see the non-HTML representation of the JSON format, set format=json.

See the complete documentation, or the API help for more information.

{
    "batchcomplete": "",
    "continue": {
        "gapcontinue": "Training_Videos",
        "continue": "gapcontinue||"
    },
    "warnings": {
        "main": {
            "*": "Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."
        },
        "revisions": {
            "*": "Because \"rvslots\" was not specified, a legacy format has been used for the output. This format is deprecated, and in the future the new format will always be used."
        }
    },
    "query": {
        "pages": {
            "30": {
                "pageid": 30,
                "ns": 0,
                "title": "SlurmBasics",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "== The CentOS/Slurm nodes ==\n\nWe have converted Beocat from Gentoo Linux to CentOS Linux on December 26th of 2017.  Any applications or libraries from the old system must be recompiled.  We also converted Beocat to use the Slurm scheduler instead of SGE.  You will therefore also need to convert all your old qsub scripts over to sbatch scripts.  We have developed tools to make this process as easy as possible. If you're needing to convert old SGE scripts to SLURM, please checkout [[kstat.convert]] \n\n=== Using Modules ===\n\nIf you're using a common code that others may also be using, we may already have it compiled in a module.  You can list the modules available and load an application as in the example below for Vasp.\n\n eos>  <B>module avail</B>\n eos>  <B>module load VASP</B>\n eos>  <B>module list</B>\n\nWhen a module gets loaded, all the necessary libraries are also loaded and the paths to the libraries and executables are automatically set up.  Loading Vasp for example also loads the OpenMPI library needed to run it and adds the path to the MPI commands and Vasp executables.   To see how the path is set up, try executing <B><I>which vasp_std</I></B>.  The module system allows you to easily switch between different version of applications, libraries, or languages as well.\n\nIf you are using a custom code or one that is not installed in a module, you'll need to recompile it yourself.  This process is easier under CentOS as some of the work just involves loading the necessary set of modules.  The first step is to decide whether to use the Intel compiler toolchain or the GNU toolchain, each of which includes the compilers and other math libraries.  The module commands for each are below, and you can load these automatically when you log in by adding one of these module load statements to your .bashrc file.  See <B>/homes/daveturner/.bashrc</B> as an example, where I put the module load statements .\n\nTo load the Intel compiler tool chain including the Intel Math Kernel Library (and OpenMPI):\n eos>  <B>module load iomkl</B>\n\nTo load the GNU compiler tool chain including OpenMPI, OpenBLAS, FFTW, and ScalaPack load foss (free open source software):\n eos>  <B>module load foss</B>\n\nModules provide an easy way to set up the compilers and libraries you may need to compile your code.  Beyond that there are many different ways to compile codes so you'll just need to follow the directions.  If you need help you can always email us at <B>beocat@cs.ksu.edu</B>.\n\n=== Submitting jobs to Slurm ===\n\nOnce your qsub script has been converted to an sbatch script and you have an application compiled for CentOS, you can submit the job using the <B>sbatch</B> command.\n\n eos> <B>sbatch sbatch_script.sh</B>\n eos> <B>kstat  --me</B>\n\nThis will submit the script and show you a list of your jobs that are running and the jobs you have in the queue.  By default the output for each job will go into a <B>slurm-###.out</B> file where ### is the job ID number.  If you need to kill a job, you can use the <B>scancel</B> command with the job ID number.\n\n== Submitting your first job ==\nTo submit a job to run under Slurm, we use the <B><I>sbatch</I></B> (submit batch) command.  The scheduler finds the optimum place for your job to run. With over 300 nodes and 7500 cores to schedule, as well as differing priorities, hardware, and individual resources, the scheduler's job is not trivial and it can take some time for a job to start even when there are empty nodes available.\n\nThere are a few things you'll need to know before running sbatch.\n* How many cores you need. Note that unless your program is created to use multiple cores (called \"threading\"), asking for more cores will not speed up your job. This is a common misperception. '''Beocat will not magically make your program use multiple cores!''' For this reason the default is 1 core.\n* How much time you need. Many users when beginning to use Beocat neglect to specify a time requirement. The default is one hour, and we get asked why their job died after one hour. We usually point them to the [[FAQ]].\n* How much memory you need. The default is 1 GB. If your job uses significantly more than you ask, your job will be killed off.\n* Any advanced options. See the [[AdvancedSlurm]] page for these requests. For our basic examples here, we will ignore these.\n\nSo let's now create a small script to test our ability to submit jobs. Create the following file (either by copying it to Beocat or by editing a text file and we'll name it <code>myhost.sh</code>. Both of these methods are documented on our [[LinuxBasics]] page.\n<syntaxhighlight lang=\"bash\" line>\n#!/bin/sh\nhostname\n</syntaxhighlight>\n\nBe sure to make it executable\n chmod u+x myhost.sh\n\nSo, now lets submit it as a job and see what happens. Here I'm going to use five options\n* <code>--mem-per-cpu=</code> tells how much memory I need. In my example, I'm using our system minimum of 512 MB, which is more than enough. Note that your memory request is '''per core''', which doesn't make much difference for this example, but will as you submit more complex jobs.\n* <code>--time=</code> tells how much runtime I need. This can be in the form of \"minutes\", \"minutes:seconds\", \"hours:minutes:seconds\", \"days-hours\", \"days-hours:minutes\" and \"days-hours:minutes:seconds\". This is a very short job, so 1 minute should be plenty. This can't be changed after the job is started please make sure you have requested a sufficient amount of time.\n* <code>--cpus-per-task=1</code> tells Slurm that I need only a single core per task. The [[AdvancedSlurm]] page has much more on the \"cpus-per-task\" switch.\n* <code>--ntasks=1</code> tells Slurm that I only need to run 1 task. The [[AdvancedSlurm]] page has much more on the \"ntasks\" switch.\n* <code>--nodes=1</code> tells Slurm that this must be run on one machine. The [[AdvancedSlurm]] page has much more on the \"nodes\" switch.\n* <code>--nodes=4 --ntasks-per-node=16 --constraint=elves</code> requests 4 nodes with 16 cores on each and to only use the Elves.\n\n % '''ls'''\n myhost.sh\n % '''sbatch --time=1 --mem-per-cpu=512M --cpus-per-task=1 --ntasks=1 --nodes=1 ./myhost.sh'''\n salloc: Granted job allocation 1483446\n\nSince this is such a small job, it is likely to be scheduled almost immediately, so a minute or so later, I now see\n % '''ls'''\n myhost.sh\n slurm-1483446.out\n\n % '''cat slurm-1483446.out'''\n mage03\n\n== Monitoring Your Job ==\n\nThe <B>kstat</B> perl script has been developed at K-State to provide you with all the available information about your jobs on Beocat.  <B>kstat --help</B> will give you a full description of how to use it.\n\n Eos>  kstat --help\n  \n  USAGE: kstat [-q] [-c] [-g] [-l] [-u user] [-p NaMD] [-j 1234567] [--part partition]\n         kstat alone dumps all info except for the core summaries\n         choose -q -c for only specific info on queued or core summaries.\n         then specify any searchables for the user, program name, or job id\n  \n  kstat                 info on running and queued jobs\n  kstat -h              list host info only, no jobs\n  kstat -q              info on the queued jobs only\n  kstat -c              core usage for each user\n  kstat -d #            show jobs run in the last # days\n                        Memory per node - used/allocated/requested\n                        Red is close to or over requested amount\n                        Yellow is under utilized for large jobs\n  kstat -g              Only show GPU nodes\n  kstat -o Turner       Only show info for a given owner\n  kstat -o CS_HPC          Same but sub _ for spaces\n  kstat -l              long list - node features and performance\n                        Node hardware and node CPU usage\n                        job nodelist and switchlist\n                        job current and max memory\n                        job CPU utilizations\n  kstat -u daveturner   job info for one user only\n  kstat --me            job info for my jobs only\n  kstat -j 1234567      info on a given job id\n  kstat --osg           show OSG background jobs also\n  kstat --nocolor       do not use any color\n  kstat --name          display full names instead of eIDs\n  \n  ---------------- Graphs and Tables ---------------------------------------\n  Specify graph/table,  CPU or GPU or host, usage or memory, and optional time\n  kstat --graph-cpu-memory #      gnuplot CPU memory for job #\n  kstat --table-gpu-usage-5min #  GPU usage table every 5 min for job #\n  kstat --table-cpu-60min #       CPU usage, memory, swap table every 60 min for job #\n  kstat --table-node [nodename]   cores, load, CPU usage, memory table for a node\n  \n  --------------------------------------------------------------------------\n    Multi-node jobs are highlighted in Magenta\n       kstat -l also provides a node list and switch list\n       highlighted in Yellow when nodes are spread across multiple switches\n    Run time is colorized yellow then red for jobs nearing their time limit\n    Queue time is colorized yellow then red for jobs waiting longer times\n  --------------------------------------------------------------------------\n\nkstat can be used to give you a summary of your jobs that are running and in the queue:\n <B>Eos>  kstat --me</B>\n\n<b>\n<font color=Brown>Hero43 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=Blue>24 of 24 cores &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black>Load 23.4 / 24 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=Red>495.3 / 512 GB used</font><br>\n&nbsp;&nbsp;&nbsp;&nbsp;\n<font color=lightgreen>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black>unafold &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234567 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=cyan>1 core &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=green>running &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black> 4gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black> 0 d  5 h 35 m </font><br>\n&nbsp;&nbsp;&nbsp;&nbsp;\n<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black>octopus &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234568 &nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=cyan>16 core &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=green>running &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=red> 128gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black> 8 d 15 h 42 m </font><br>\n<font color=green> ##################################   BeoCat Queue    ################################### </font><br>\n&nbsp;&nbsp;&nbsp;&nbsp;\n<font color=green>daveturner &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black>NetPIPE &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1234569 &nbsp;&nbsp;&nbsp;&nbsp; </font>\n<font color=cyan>2 core &nbsp;&nbsp;&nbsp;</font>\n<font color=red> PD &nbsp;</font>\n<font color=black> 2h &nbsp;</font>\n<font color=black> 4gb req &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</font>\n<font color=black> 0 d 1 h 2 m </font><br>\n</b>\n\n<b>kstat</b> produces a separate line for each host.  Use <b>kstat -h</b> to see information on all hosts without the jobs.\nFor the example above we are listing our jobs and the hosts they are on.\n\nCore usage - yellow for empty, red for empty on owned nodes, cyan for partially used, blue for all cores used.<BR>\nLoad level - yellow or yellow background indicates the node is being inefficiently used.  Red just means more threads than cores.<br>\nMemory usage - yellow or red means most memory is used.<BR>\nIf the node is owned the group name will be in orange on the right.  Killable jobs can still be run on those nodes.<BR>\n\nEach job line will contain the username, program name, job ID, number of cores, the status which may be colored red for killable jobs, \nthe maximum memory used or memory requested, and the amount of time the job has run.  \nJobs in the queue may contain information on the requested memory and run time, priority access, constraints, and\nhow long the job has been in the queue.\nIn this case, I have 2 jobs running on Hero43.  <i>unafold</i> is using 1 core while <i>octopus</i> is using 16 cores.  Slurm did not provide\nany information on the actual memory use so the memory request is reported  \n\n=== Detailed information about a single job ===\n\nkstat can provide a great deal of information on a particular job including a very rough estimate of when it will run.  This time is a worst case scenario as this will\nbe adapted as other jobs finish early.  This is a good way to check for job submission problems before contacting us.  kstat colorizes the more important\ninformation to make it easier to identify.\n\n Eos>  kstat -j 157054\n \n ##################################   Beocat Queue    ###################################\n  daveturner  netpipe     157054   64 cores  PD       dwarves fabric  CS HPC     8gb req   0 d  0 h  0 m\n \n JobId 157054  Job Name  netpipe\n   UserId=daveturner GroupId=daveturner_users(2117) MCS_label=N/A\n   Priority=11112 Nice=0 Account=ksu-cis-hpc QOS=normal\n   Status=PENDING Reason=Resources Dependency=(null)\n   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n   RunTime=00:00:00 TimeLimit=00:40:00 TimeMin=N/A\n   SubmitTime=2018-02-02T18:18:31 EligibleTime=2018-02-02T18:18:31\n   Estimated Start Time is 2018-02-03T06:17:49 EndTime=2018-02-03T06:57:49 Deadline=N/A\n   PreemptTime=None SuspendTime=None SecsPreSuspend=0\n   Partitions killable.q,ksu-cis-hpc.q AllocNode:Sid=eos:1761\n   ReqNodeList=(null) ExcNodeList=(null)\n   NodeList=(null) SchedNodeList=dwarf[01-02]\n   NumNodes=2-2 NumCPUs=64 NumTasks=64 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n   TRES 2 nodes 64 cores 8192  mem gres/fabric 2\n   Socks/Node=* NtasksPerN:B:S:C=32:0:*:* CoreSpec=*\n   MinCPUsNode=32 MinMemoryNode=4G MinTmpDiskNode=0\n   Constraint=dwarves DelayBoot=00:00:00\n   Gres=fabric Reservation=(null)\n   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n   Slurm script  /homes/daveturner/perf/NetPIPE-5.x/sb.np\n   WorkDir=/homes/daveturner/perf/NetPIPE-5.x\n   StdErr=/homes/daveturner/perf/NetPIPE-5.x/0.o157054\n   StdIn=/dev/null\n   StdOut=/homes/daveturner/perf/NetPIPE-5.x/0.o157054\n   Switches=1@00:05:00\n<syntaxhighlight lang=bash>\n#!/bin/bash -l\n#SBATCH --job-name=netpipe\n#SBATCH -o 0.o%j\n#SBATCH --time=0:40:00\n#SBATCH --mem=4G\n#SBATCH --switches=1\n#SBATCH --nodes=2\n#SBATCH --constraint=dwarves\n#SBATCH --ntasks-per-node=32\n#SBATCH --gres=fabric:roce:1\n\nhost=`echo $SLURM_JOB_NODELIST | sed s/[^a-z0-9]/\\ /g | cut -f 1 -d ' '`\nnprocs=$SLURM_NTASKS\nopenmpi_hostfile.pl $SLURM_JOB_NODELIST 1 hf.$host\nopts=\"--printhostnames --quick --pert 3\"\n\necho \"*******************************************************************\"\necho \"Running on $SLURM_NNODES nodes $nprocs cores on nodes $SLURM_JOB_NODELIST\"\necho \"*******************************************************************\"\n\nmpirun -np 2 --hostfile hf.$host NPmpi $opts -o np.${host}.mpi\nmpirun -np 2 --hostfile hf.$host NPmpi $opts -o np.${host}.mpi.bi --async --bidir\nmpirun -np $nprocs NPmpi $opts -o np.${host}.mpi$nprocs --async --bidir\n</syntaxhighlight>\n\n=== Completed jobs and memory usage ===\n\n kstat -d #\n\nThis will provide information on the jobs you have currently running and those that have completed\nin the last '#' days.  This is currently the only reliable way to get the memory used per node for your job.\nThis also provides information on whether the job completed normally, was canceled with <I>scancel</I>, \ntimed out, or was killed because it exceeded its memory request.\n\n Eos>  kstat -d 10\n\n ###########################  sacct -u daveturner  for 10 days  ###########################\n                                      max gb used on a node /   gb requested per node\n  193037   ADF         dwarf43           1 n  32 c   30.46gb/100gb    05:15:34  COMPLETED\n  193289   ADF         dwarf33           1 n  32 c   26.42gb/100gb    00:50:43  CANCELLED\n  195171   ADF         dwarf44           1 n  32 c   56.81gb/120gb    14:43:35  COMPLETED\n  209518   matlab      dwarf36           1 n   1 c    0.00gb/  4gb    00:00:02  FAILED\n\n=== Summary of core usage ===\n\nkstat can also provide a listing of the core usage and cores requested for each user.\n Eos>  kstat -c\n \n ##############################   Core usage    ###############################\n   antariksh       1512 cores   %25.1 used     41528 cores queued\n   bahadori         432 cores   % 7.2 used        80 cores queued\n   eegoetz            0 cores   % 0.0 used         2 cores queued\n   fahrialkan        24 cores   % 0.4 used        32 cores queued\n   gowri             66 cores   % 1.1 used        32 cores queued\n   jeffcomer        160 cores   % 2.7 used         0 cores queued\n   ldcoates12        80 cores   % 1.3 used       112 cores queued\n   lukesteg         464 cores   % 7.7 used         0 cores queued\n   mike5454        1060 cores   %17.6 used       852 cores queued\n   nilusha          344 cores   % 5.7 used         0 cores queued\n   nnshan2014       136 cores   % 2.3 used         0 cores queued\n   ploetz           264 cores   % 4.4 used        60 cores queued\n   sadish           812 cores   %13.5 used         0 cores queued\n   sandung           72 cores   % 1.2 used        56 cores queued\n   zhiguang          80 cores   % 1.3 used       688 cores queued\n\n=== Producing memory and CPU utilization tables and graphs ===\n\nkstat can now produce tables or graphs for the memory or CPU utilization\nfor a job.  In order to view graphs you must set up X11 forwarding on your\nssh connection by using the -X parameter.\n\nIf you want to read more, continue on to our [[AdvancedSlurm]] page."
                    }
                ]
            },
            "18": {
                "pageid": 18,
                "ns": 0,
                "title": "Tips and Tricks",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "Beocat has a number of tools to make your work easier, some which you may not know about. This is a simple list of these programs and some basic usage scenarios.\n\n== Submitting your job to run the fastest ==\n=== Size your jobs to use the fastest nodes ===\n==== Specify the proper number of cores ====\nBeocat (nor any other computer or cluster) can make your job run on more than one core at a time if your program isn't designed to take advantage of this. Many people think \"I can run this on 40 cores and it will run 40 times faster\". This isn't true.\n\nWhile we have many programs that are designed to take advantage of multiple cores, do not assume this is the case\n\n==== Optimize your jobs for speed, not for number of cores ====\nIt seems that many people pick an arbitrary large number of cores for their jobs. 20 seems to be a common one. However, some of our fastest nodes have 16 cores. It's quite likely if your job will fit on an Elf (16 cores, 8 GB/RAM/core (64 GB RAM total)), it will run faster with 16 cores than by specifying more cores and having it run on slower nodes.\n\n==== Don't request resources you don't need ====\nThe most common culprit here is people specifying they need infiniband when the job is run on a single node. This limits the scheduling such that a perfectly good node for your job may be idle while your job is still waiting.\n\n== Programs that make using Beocat easier ==\n=== [[wikipedia:nmon|nmon]] ===\nThe name is short for \"Nigel's Monitor\", it's a program written by Nigel Griffiths from IBM.\n=== [http://www.ibm.com/developerworks/aix/library/au-nmon_analyser/ nmon analyser] ===\nA tool for producing graphs and spreadsheets from output generated by nmon.\n=== [http://hisham.hm/htop/ htop] ===\nA prettier, easier to use top. Shows CPU and memory usage in an easy-to-digest format.\n=== [http://www.gnu.org/software/screen/ screen] ===\nA sort of terminal multiplexer, allows you to run many terminal programs at once without mixing them up. Also allows you to disconnect and reconnect sessions. There is a good explanation of how to use screen at [http://www.mattcutts.com/blog/a-quick-tutorial-on-screen/ http://www.mattcutts.com/blog/a-quick-tutorial-on-screen/].\n=== Ganglia ===\nThe web-based load monitoring tool for the cluster. [http://ganglia.beocat.cis.ksu.edu http://ganglia.beocat.ksu.edu] . From there, you can see how busy Beocat is.\n=== [http://dag.wieers.com/home-made/dstat/ dstat] ===\nA very detailed performance analyzer.\n\n== Increasing file write performance ==\nCredit for this goes to [http://moo.nac.uci.edu/~hjm/bduc/BDUC_USER_HOWTO.html#writeperfongl http://moo.nac.uci.edu/~hjm/bduc/BDUC_USER_HOWTO.html#writeperfongl]\n\n=== Use gzip ===\nIf you have written your own code or are using an app that writes zillions of tiny chunks of data to STDOUT, and you are storing the results on Beocat, you should consider passing the output thru gzip to consolidate the writes into a continuous stream. If you don\u2019t do this, each write will be considered a separate IO event and the write performance will suffer.\n\nIf, however, the STDOUT is passed thru gzip, the wallclock runtime decreases even below the usual runtime and you end up with an output file that it already compressed to about 1/5 the usual size.\n\nThe here\u2019s how to do it:\n\n someapp --opt1 --opt2 --input=/path/to/input_file | gzip > /path/to/output_file\n=== Use named pipes ===\nNamed pipes are special files that don't actually write to the filesystem, and can be used to communicate between processess. Since these pipes are in memory rather than directly to disk, they can be used to buffer writes:\n\n<syntaxhighlight lang=\"bash\">\n# Create the named pipe\nmkfifo /path/to/MyNamedPipe\n\n# Write some data to it\nMyProgram --infile=/path/to/InputData1 --outfile=/path/to/MyNamedPipe &\nMyOtherProgram < /path/to/InputData2 > /path/to/MyNamedPipe\n\n# Extract the output\ncat < /path/to/MyNamedPipe > $HOME/MyOutput\n## OR, we could compress the output\ngzip < /path/to/MyNamedPipe > $HOME/MyOutput.gz\n\n# Delete the named pipe like you would a file\nrm /path/to/MyNamedPipe\n</syntaxhighlight>\nOne cautionary word. Unlike normal files, named pipes cannot be used between machines, but can be used among processes running on the same machine. So, if you're running an MPI job that will be running completely on one node, you could setup a named pipe and do all your writes to that pipe, and then flush it at the end, but if you're running a multi-node MPI job and your named pipe is on a shared filesystem (like $HOME), each process will need to flush its named pipe to a regular file before the job quits.\n=== Use one big file instead of many small ones ===\nThis may seem to be a non-issue, but it's a performance problem we've seen on Beocat many times. I love the term coined by UCI at the link above. They call making many small files \"Zillions Of Tiny files (ZOTfiles)\". Using files like this is an inefficient use of our shared resources. A tiny file by itself is no more inefficient than a huge one. If you have only 100bytes to store, store it in single file. However, the problems start compounding when there are many of them. Because of the way data is stored on disk, 10 MB stored in ZOTfiles of 100bytes each can easily take up NOT 10MB, but more than 400MB - 40 times more space. Worse, data stored in this manner makes many operations very slow - instead of looking up 1 directory entry, the OS has to look up 100,000. This means 100,000 times more disk head movement, with a concommittent decrease in performance and disk lifetime. We have had Beocat users with several million files of less than 1kB each. Just creating a directory listing with ls would take nearly 1/2 hour. Not only is that inefficient for you, but it also degrades the performance of everybody using that filesystem and degrades our backups as well.\n\nPlease use large files instead of ZOTfiles any chance you can!\n\nAs a defense against too much abuse of tiny files, there is a limit of 100,000 entries in any directory in our shared filesystem space."
                    }
                ]
            }
        }
    }
}