October 1, 2008

Using ps to actively monitor processes

The ps command, as almost everyone knows, reports the status of one or more processes at a given moment. I suppose some of us have used the famous "ps aux" on Linux, or its Solaris equivalent, "ps -fea".

For work reasons I needed to monitor processes that were consuming processor time and even virtual memory. Fortunately, I found that there is a -o option that lets you specify the output format, that is, the data we want to appear.
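As a minimal sketch of what -o does, the function below asks ps for a chosen set of columns only. The name show_columns is made up for this illustration; the column names are ones described in the man page.

```shell
# show_columns is a hypothetical helper, not part of ps.
# -e selects every process; -o takes a comma-separated list of column names.
show_columns() {
    ps -e -o "$1"
}

# Example: PID, CPU percentage, virtual size in kilobytes and command name
# show_columns pid,pcpu,vsz,comm
```

The first line of the output is a header row naming each column, followed by one line per process.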

The options I found are the following (taken from the Solaris ps manual page):

user The effective user ID of the process. This will be the textual user ID, if it can be obtained and the field width permits, or a decimal representation otherwise.

ruser The real user ID of the process. This will be the textual user ID, if it can be obtained and the field width permits, or a decimal representation otherwise.

group The effective group ID of the process. This will be the textual group ID, if it can be obtained and the field width permits, or a decimal representation otherwise.

rgroup The real group ID of the process. This will be the textual group ID, if it can be obtained and the field width permits, or a decimal representation otherwise.

pid The decimal value of the process ID.

ppid The decimal value of the parent process ID.

pgid The decimal value of the process group ID.

pcpu The ratio of CPU time used recently to CPU time available in the same period, expressed as a percentage. The meaning of "recently" in this context is unspecified. The CPU time available is determined in an unspecified manner.

vsz The total size of the process in virtual memory, in kilobytes.

nice The decimal value of the system scheduling priority of the process. See nice(1).

etime In the POSIX locale, the elapsed time since the process was started, in the form:
[[dd-]hh:]mm:ss
where

dd is the number of days
hh is the number of hours
mm is the number of minutes
ss is the number of seconds

time In the POSIX locale, the cumulative CPU time of the process in the form:

[dd-]hh:mm:ss

The dd, hh, mm, and ss fields will be as described in the etime specifier.

tty The name of the controlling terminal of the process (if any) in the same format used by the who(1) command.

comm The name of the command being executed (argv[0] value) as a string.

args The command with all its arguments as a string. The implementation may truncate this value to the field width; it is implementation-dependent whether any further truncation occurs. It is unspecified whether the string represented is a version of the argument list as it was passed to the command when it started, or is a version of the arguments as they may have been modified by the application. Applications cannot depend on being able to modify their argument list and having that modification be reflected in the output of ps. The Solaris implementation limits the string to 80 bytes; the string is the version of the argument list as it was passed to the command when it started.

f Flags (hexadecimal and additive) associated with the process.

s The state of the process.

c Processor utilization for scheduling (obsolete).

uid The effective user ID number of the process as a decimal integer.

ruid The real user ID number of the process as a decimal integer.

gid The effective group ID number of the process as a decimal integer.

rgid The real group ID number of the process as a decimal integer.

projid The project ID number of the process as a decimal integer.

project The project ID of the process as a textual value if that value can be obtained; otherwise as a decimal integer.

sid The process ID of the session leader.

taskid The task ID of the process.

class The scheduling class of the process.

pri The priority of the process. Higher numbers mean higher priority.

opri The obsolete priority of the process. Lower numbers mean higher priority.

lwp The decimal value of the lwp ID. Requesting this formatting option causes one line to be printed for each lwp in the process.

nlwp The number of lwps in the process.

psr The number of the processor to which the process or lwp is bound.

pset The ID of the processor set to which the process or lwp is bound.

addr The memory address of the process.

osz The total size of the process in virtual memory, in pages.

wchan The address of an event for which the process is sleeping (if -, the process is running).

stime The starting time or date of the process, printed with no blanks.

rss The resident set size of the process, in kilobytes.

pmem The ratio of the process's resident set size to the physical memory on the machine, expressed as a percentage.

fname The first 8 bytes of the base name of the process's executable file.
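The etime and time values described above are awkward to compare or plot as strings. A small sketch of converting the [[dd-]hh:]mm:ss form into seconds follows; etime_to_seconds is a name invented here, and it assumes a POSIX awk (on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
# etime_to_seconds (hypothetical helper): reads one [[dd-]hh:]mm:ss value
# per line on stdin and prints the equivalent number of seconds.
etime_to_seconds() {
    awk '{
        days = 0
        t = $1
        # Optional "dd-" prefix: split it off and keep the rest
        if (index(t, "-") > 0) {
            split(t, d, "-")
            days = d[1]
            t = d[2]
        }
        # Either hh:mm:ss (3 parts) or mm:ss (2 parts)
        n = split(t, p, ":")
        if (n == 3)
            secs = p[1] * 3600 + p[2] * 60 + p[3]
        else
            secs = p[1] * 60 + p[2]
        print days * 86400 + secs
    }'
}
```

For example, `echo 02-03:04:05 | etime_to_seconds` prints 183845 (two days, three hours, four minutes and five seconds).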

Next, I put together a small script that runs:

ps -ea -o pid,pcpu,pmem,vsz,osz,rss,etime,s,args | awk '{print $1","$2","$3","$4","$5","$6","$7","$8",\""$9,$10,$11,$12,$13,$14,$15"\""}' > estadoDeMemoria.csv

This way I get the PID, the percentage of processor usage, the percentage of memory usage, the virtual memory size in kilobytes, the virtual memory size in pages, the resident set size, the elapsed running time, the state of the process, and the associated command along with its arguments (the awk expression keeps only the first seven words of the command line).
This, combined with crontab and grep, yields a considerable amount of information, especially if we run it every five minutes.
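One way this could be wired up for cron is sketched below. It is not the script I actually run: the function names, the file-name pattern and the paths are placeholders, and it assumes a POSIX shell and awk (on Solaris, for instance, /usr/xpg4/bin/sh and nawk). Compared with the one-liner, it keeps the whole command line instead of the first seven words, doubles any embedded quotes so the CSV stays valid, and writes one timestamped file per run.

```shell
#!/bin/sh
# Hypothetical wrapper script; names and paths are placeholders.
# Example crontab entry, with the minutes listed explicitly because
# not every cron implementation accepts the */5 shorthand:
# 0,5,10,15,20,25,30,35,40,45,50,55 * * * * /path/to/snapshot.sh /var/tmp/ps-snapshots

# Turn ps output (8 fixed columns followed by args) into CSV on stdout.
ps_to_csv() {
    awk '{
        # Rebuild the command line from field 9 onward, however long it is
        args = $9
        for (i = 10; i <= NF; i++) args = args " " $i
        # CSV convention: double any embedded double quote
        gsub(/"/, "\"\"", args)
        printf "%s,%s,%s,%s,%s,%s,%s,%s,\"%s\"\n", $1, $2, $3, $4, $5, $6, $7, $8, args
    }'
}

# Print a per-run file name such as DIR/estadoDeMemoria-20081001-1205.csv
snapshot_file() {
    printf '%s/estadoDeMemoria-%s.csv\n' "$1" "$(date +%Y%m%d-%H%M)"
}

# Take one snapshot into directory $1 and print the file that was written.
take_snapshot() {
    mkdir -p "$1" || return 1
    out=$(snapshot_file "$1")
    ps -e -o pid,pcpu,pmem,vsz,osz,rss,etime,s,args | ps_to_csv > "$out"
    printf '%s\n' "$out"
}

# Run only when an output directory is given on the command line
if [ "$#" -gt 0 ]; then
    take_snapshot "$1"
fi
```

Rebuilding args field by field collapses runs of spaces inside the command line into single spaces, which seemed an acceptable trade-off for a monitoring log.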

Now, assuming each run generates an output file (estadoDeMemoria.csv), we could also write a script that parses those outputs to produce a summary showing only the changes and the usage percentages for each category.
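A rough sketch of such a parser, comparing two snapshots by PID, might look like this. The function name compare_snapshots and the output layout (PID, status, VSZ difference, RSS difference) are choices made for this example; it assumes the CSV column order produced above (VSZ in field 4, RSS in field 6) and a POSIX awk.

```shell
# compare_snapshots OLD NEW (hypothetical helper): report processes whose
# VSZ or RSS changed between two CSV snapshots, plus new and vanished PIDs.
compare_snapshots() {
    awk -F',' '
        # Skip the header row that ps prints (its first field is "PID")
        $1 == "PID" { next }
        # First file: remember VSZ (field 4) and RSS (field 6) per PID
        NR == FNR { vsz[$1] = $4; rss[$1] = $6; next }
        # Second file: PID already known, print the differences if any
        ($1 in vsz) {
            if ($4 != vsz[$1] || $6 != rss[$1])
                printf "%s,changed,%d,%d\n", $1, $4 - vsz[$1], $6 - rss[$1]
            seen[$1] = 1
            next
        }
        # Second file: PID that was not in the first snapshot
        { printf "%s,new,%d,%d\n", $1, $4, $6 }
        END {
            # PIDs present in the first snapshot only
            for (p in vsz)
                if (!(p in seen))
                    printf "%s,gone,%d,%d\n", p, -vsz[p], -rss[p]
        }
    ' "$1" "$2"
}
```

Called as `compare_snapshots old.csv new.csv`, it prints one line per process that changed, appeared or disappeared. Note that it treats a reused PID as the same process, which may be wrong over long intervals.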

I know, I know, there are products out there that already do this, but isn't it nice to build your own homemade process monitor?

Cheers!

1 comment:

Anonymous said...

Nice one, brother, good script