
post.xcat_restructure

penguhyang edited this page Dec 1, 2015 · 38 revisions

Mini-design for the post.xcat restructure

Background

The code logic in the original post.xcat file has some problems. We should distinguish critical errors from plain errors to make debugging easier. When an error happens, we should record the detailed information on both the management node (MN) and the node.

Critical errors

Solution: write the error information on the node and the MN, then halt the system.

1. openssl is not installed on the system.
2. Downloading the postscripts fails.
    We use the wget command to download the postscripts from http://$i$INSTALLDIR/postscripts/ on the MN. The download may fail for several reasons:
    1) The wget command is not available.
    2) The network is unreachable.
3. getpostscript.awk does not exist.
    We first try to download the mypostscript.$NODE file from the MN and, if the MN has it, rename it to mypostscript. If the MN does not have this file, we try to create the mypostscript file using getpostscript.awk. If getpostscript.awk is not in the /xcatpost folder, this error occurs.
4. Creating the mypostscript file fails.
    The mypostscript file is used to generate mypostscript.post and other files. If it cannot be produced by either of the two methods above (download or getpostscript.awk), this error occurs.
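
The download step behind critical error 2 could be sketched as follows. This is a minimal sketch, not the actual post.xcat code: the function name fetch_postscripts and the FETCH_CMD override are hypothetical, and the wget options are only one plausible choice. MASTER_IPS is assumed to hold the list of MN addresses that $i iterates over.

```shell
#!/bin/sh
# Sketch only: fetch_postscripts and FETCH_CMD are hypothetical names.
# INSTALLDIR and GOTALL follow the variables named in this design;
# MASTER_IPS is an assumed space-separated list of MN addresses.

# Default downloader: recursive wget with 3 tries and a 20 second timeout.
# Override FETCH_CMD to exercise the logic without a network.
FETCH_CMD=${FETCH_CMD:-"wget -q -r -np -nH -t 3 -T 20 -P"}

fetch_postscripts() {
    dest=$1
    GOTALL=0

    # Critical error 2, reason 1: the download command is missing.
    if ! command -v "${FETCH_CMD%% *}" >/dev/null 2>&1; then
        echo "download command not found" >&2
        return 2
    fi

    # Critical error 2, reason 2: no MN address is reachable.
    for i in $MASTER_IPS; do
        if $FETCH_CMD "$dest" "http://$i$INSTALLDIR/postscripts/"; then
            GOTALL=1
            break
        fi
    done

    # Succeed only if one of the addresses served the postscripts.
    [ "$GOTALL" -eq 1 ]
}
```

A caller would treat a non-zero return from `fetch_postscripts /xcatpost` as a critical error: record it on the node and the MN, then halt.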

Plain errors

Solution: write the error information on the node and the MN, but do not halt the system.

1. Downloading the precreated mypostscript file fails.
2. Creating the mypostscript.post file fails.
3. Creating the xcatpostinit1 file fails.
4. Creating the xcatinstallpost file fails.
5. Creating the xcatdsklspost file fails.
6. Creating the final mypostscript file fails.
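
The two error classes differ only in whether the node halts afterwards, so a single helper could serve both. The sketch below is an assumption about how that might look: report_error, notify_mn, halt_node and ERROR_LOG are hypothetical names, and this design does not specify how a message reaches the MN (here it is assumed that local syslog is forwarded there).

```shell
#!/bin/sh
# Sketch only: report_error, notify_mn, halt_node and ERROR_LOG are
# hypothetical names, not part of the existing post.xcat.
ERROR_LOG=${ERROR_LOG:-/var/log/xcat/post.xcat.log}

# Assumption: syslog on the node is forwarded to the MN.
notify_mn() {
    logger -t xcat "$1" 2>/dev/null || true
}

# Assumption: "halt" means the script stops making progress and waits.
halt_node() {
    while true; do sleep 60; done
}

# report_error <critical|plain> <message...>
report_error() {
    severity=$1
    shift
    msg="post.xcat: $severity: $*"

    # Record the error on the node.
    mkdir -p "$(dirname "$ERROR_LOG")" 2>/dev/null
    echo "$msg" >> "$ERROR_LOG"

    # Record the error on the MN.
    notify_mn "$msg"

    # Only critical errors stop the installation.
    if [ "$severity" = "critical" ]; then
        halt_node
    fi
    return 0
}
```

For example, `report_error plain "create the xcatpostinit1 file failure"` logs and continues, while `report_error critical "openssl is not installed"` logs and then halts.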

Code Logic and Process

  1. Export environment variables such as MASTER_IP, NODESTATUS and TFTPDIR.
  2. Include the xCAT library to use some of its functions.
  3. Set the values of INSTALLDIR and TFTPDIR if they have not been set.
  4. Sleep for a while, then download the postscripts from the management node.
  5. Before downloading the postscripts from the management node, check whether openssl is installed. If it is not, the system should halt.
  6. Download the postscripts from the MN with the wget command, using a variable GOTALL as a flag that shows whether the download succeeded. If it did not, the system should halt.
  7. Once the postscripts have been downloaded successfully, create the mypostscript file.
  8. First try to download the mypostscript.$NODE file. This file is created when the precreatemypostscripts attribute is set to 1. If the file exists, rename it to mypostscript.
  9. If there is no mypostscript.$NODE file, generate the mypostscript file with getpostscript.awk. If getpostscript.awk does not exist, the system should halt.
  10. Use a while loop to generate mypostscript with getpostscript.awk, so that a failed attempt is retried.
  11. Use the sed command to add run_ps before the commands in the mypostscript file. Then output the run_ps subroutine and append the mypostscript file content to it to recreate the mypostscript file. If this file cannot be created, the system should halt.
  12. Now that the mypostscript file exists, use it to create the mypostscript.post file, with a sed command that deletes the items between postscripts-start-here and postscripts-end-here.
  13. Create the post init file (xcatpostinit1).
  14. Create the xcatinstallpost file.
  15. Create the dskls post file (xcatdsklspost).
  16. Finally, recreate the mypostscript file with a sed command that deletes the items between postbootscripts-start-here and postbootscripts-end-here.
  17. Update the node status using updateflag.awk.
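
Steps 12 and 16 both rely on a sed address range. A minimal sketch of the two deletions is shown below, assuming the markers appear as text on their own lines in mypostscript. The function name split_mypostscript is hypothetical, and this simple range also removes the marker lines themselves, which may differ from the real script.

```shell
#!/bin/sh
# Sketch only: split_mypostscript is a hypothetical name.
# Assumes each start/end marker sits on its own line in mypostscript.

split_mypostscript() {
    dir=$1

    # Step 12: mypostscript.post is mypostscript without the section
    # between postscripts-start-here and postscripts-end-here.
    sed '/postscripts-start-here/,/postscripts-end-here/d' \
        "$dir/mypostscript" > "$dir/mypostscript.post" || return 1

    # Step 16: mypostscript is rewritten without the section between
    # postbootscripts-start-here and postbootscripts-end-here.
    # Write to a temporary file first, because sed cannot safely read
    # from and redirect into the same file.
    sed '/postbootscripts-start-here/,/postbootscripts-end-here/d' \
        "$dir/mypostscript" > "$dir/mypostscript.tmp" || return 1
    mv "$dir/mypostscript.tmp" "$dir/mypostscript"
}
```

The two patterns do not interfere with each other: the string postscripts-start-here does not occur inside postbootscripts-start-here, so each range only matches its own section. A failure of either sed call maps to plain errors 2 and 6 respectively.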

Planned Outputs

When xcatdebugmode is on, the log information will be saved.

  1. The system sleeps for a while to get ready. The output will look like:

    sleep 16

  2. Before downloading the postscripts from the management node, check whether openssl is installed. If it is not, the output will look like:

    /usr/bin/openssl does not exist, hang ...
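
One way the openssl message could be produced and saved only in debug mode is sketched below. The names debug_log, check_openssl and DEBUG_LOG are hypothetical, and it is an assumption that the debug setting reaches the script as an XCATDEBUGMODE variable where any value other than 0 means "on".

```shell
#!/bin/sh
# Sketch only: debug_log, check_openssl and DEBUG_LOG are hypothetical
# names. XCATDEBUGMODE is assumed to be 0 when debug mode is off.
XCATDEBUGMODE=${XCATDEBUGMODE:-0}
DEBUG_LOG=${DEBUG_LOG:-/tmp/post.xcat.debug.log}

# Print a message, and also save it when debug mode is on.
debug_log() {
    echo "$*"
    if [ "$XCATDEBUGMODE" != "0" ]; then
        echo "$*" >> "$DEBUG_LOG"
    fi
}

# Return non-zero if the openssl binary is missing, so the caller can
# handle it as a critical error.
check_openssl() {
    bin=${1:-/usr/bin/openssl}
    if [ ! -x "$bin" ]; then
        debug_log "$bin does not exist, hang ..."
        return 1
    fi
}
```

Calling `check_openssl` with no argument tests /usr/bin/openssl, which yields the message shown above when the binary is absent.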
