DataStage environments
DataStage offers six PX environments that you can use to run your jobs. A Default DataStage PX S runtime environment is started when you run a data flow as a job. However, before you run the flow as a job, you can change the environment to any of the six available environments.
All runtimes consume capacity unit hours (CUHs) that are tracked. Only the time it takes to run jobs is tracked. Creating, configuring, and updating flows on the canvas does not use any CUHs.
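The arithmetic behind CUH tracking can be sketched as a rate multiplied by the job run time. The following is only an illustration: the function name `consumed_cuh` and the rate value used in the example are hypothetical placeholders, not published DataStage rates. Check your plan for the actual CUH rate of each environment.

```python
def consumed_cuh(rate_per_hour: float, run_seconds: float) -> float:
    """Estimate the capacity unit hours used by one job run.

    rate_per_hour is a hypothetical CUH rate for the chosen environment;
    run_seconds is the time the job actually ran. Time spent creating,
    configuring, or updating flows on the canvas is not counted.
    """
    if rate_per_hour < 0 or run_seconds < 0:
        raise ValueError("rate and run time must be non-negative")
    return rate_per_hour * (run_seconds / 3600)


# Example with a placeholder rate of 2.0 CUH per hour and a 30-minute run.
estimate = consumed_cuh(rate_per_hour=2.0, run_seconds=1800)
print(f"Estimated usage: {estimate} CUH")
```

Because only job run time is tracked, editing a flow for an hour and then running it for 30 minutes is charged for the 30 minutes only.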
Transforming data
When you run a job to extract, transform, or load data in DataStage, a Default DataStage PX S runtime is started automatically and is listed as an active runtime on the Environments page of your project.
Running a flow
You can create a job in which to run your DataStage flow:
- Directly on the DataStage canvas, by clicking the run icon on the DataStage toolbar.
- From your project’s DataStage flows page, by selecting the DataStage flow, clicking the Action menu, and selecting Create job.
Environment options in jobs
When you create a job in which to run a DataStage flow, you can select one of the following preset environments:
| Name | Hardware configuration |
|---|---|
| Default DataStage PX S | 1 Conductor: 1 vCPU and 4 GB RAM; X |
| Default DataStage PX M | 1 Conductor: 2 vCPU and 8 GB RAM; X |
| Default DataStage PX L | 1 Conductor: 4 vCPU and 16 GB RAM; X |
| Default DataStage PX (MPP) S | 1 Conductor: 1 vCPU and 4 GB RAM; 2 computes: 4 vCPU and 16 GB RAM |
| Default DataStage PX (MPP) M | 1 Conductor: 1 vCPU and 4 GB RAM; 4 computes: 4 vCPU and 16 GB RAM |
| Default DataStage PX (MPP) L | 1 Conductor: 1 vCPU and 4 GB RAM; 8 computes: 4 vCPU and 16 GB RAM |
The Default DataStage PX S runtime is used when you run a job to extract, transform, and load data in DataStage, unless you select a different environment.
Select the Default DataStage PX S runtime or another environment with fewer compute nodes, less CPU, and less RAM if you run jobs that operate on small data sets. For jobs that include large data sets or many stages, select an environment with more CPU and memory so that jobs run faster.
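To compare the preset environments, it can help to encode the table as data and total the resources of each one. The sketch below is illustrative only and is not a DataStage API: the `Environment` class and the `smallest_fit` helper are hypothetical names. It also rests on two assumptions: an "X" in the table means the environment has no separate compute nodes, and the compute figures (4 vCPU and 16 GB RAM) apply to each compute node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    name: str
    conductor_vcpu: int
    conductor_ram_gb: int
    compute_nodes: int = 0   # assumption: "X" in the table means no compute nodes
    compute_vcpu: int = 0    # assumption: per compute node
    compute_ram_gb: int = 0  # assumption: per compute node

    @property
    def total_vcpu(self) -> int:
        return self.conductor_vcpu + self.compute_nodes * self.compute_vcpu

    @property
    def total_ram_gb(self) -> int:
        return self.conductor_ram_gb + self.compute_nodes * self.compute_ram_gb


# Values copied from the preset environment table.
ENVIRONMENTS = [
    Environment("Default DataStage PX S", 1, 4),
    Environment("Default DataStage PX M", 2, 8),
    Environment("Default DataStage PX L", 4, 16),
    Environment("Default DataStage PX (MPP) S", 1, 4, 2, 4, 16),
    Environment("Default DataStage PX (MPP) M", 1, 4, 4, 4, 16),
    Environment("Default DataStage PX (MPP) L", 1, 4, 8, 4, 16),
]


def smallest_fit(min_vcpu: int, min_ram_gb: int) -> Environment:
    """Return the smallest preset that meets both resource minimums."""
    candidates = [
        env for env in ENVIRONMENTS
        if env.total_vcpu >= min_vcpu and env.total_ram_gb >= min_ram_gb
    ]
    if not candidates:
        raise ValueError("no preset environment meets the requested resources")
    return min(candidates, key=lambda env: (env.total_vcpu, env.total_ram_gb))


# Print the totals for each preset under the assumptions above.
for env in ENVIRONMENTS:
    print(f"{env.name}: {env.total_vcpu} vCPU, {env.total_ram_gb} GB RAM")
```

How much CPU and memory a given job needs depends on the data volume and the number of stages, so treat the minimums passed to `smallest_fit` as estimates that you refine after observing actual job runs.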
To update the environment that you want to use:
- On the flow canvas, select the run settings icon and select the environment that you want to use.
- Select a job, edit the job configuration, and on the run settings tab, change the environment.
Runtime logs for jobs
To view the accumulated logs for a DataStage job:
- From the project’s Jobs page, click the job that ran the DataStage flow for which you want to see logs.
- Click the job run. You can view the job log, copy the log to clipboard, or download the log.