What is PICS?
PICS (Public IaaS Cloud Simulator) is designed to evaluate the performance of both public IaaS clouds and cloud applications without actually deploying the applications.
- Main Capabilities:
- Assessing a wide range of properties of cloud services and cloud applications, including cloud cost, job response time, and resource utilization.
- Allowing PICS users to specify different workload types, including dynamic job arrival patterns and SLA requirements (e.g. job deadline).
- Simulating a broad range of resource management policies, i.e., horizontal and vertical auto scaling, custom job scheduling policies, and job failure cases.
- Enabling PICS users to evaluate the performance of different types of current public IaaS cloud configurations such as a variety of resource types (VM instances and storage services), unique billing models, and performance uncertainty.
- Design Goal
- Precise modeling of the behavior of public cloud providers (including a variety of cloud resources, e.g., VM, storage, and network).
- Precise modeling of the behavior of public cloud applications (including dynamic workload changes and performance uncertainty).
- Precise modeling of the behavior of cloud users' resource management policies.
- Simulation Inputs (PICS Configurations): PICS inputs consist of three configuration files.
- General PICS configurations (./config/config.txt)
- Simulator Configurations
- Public IaaS Configurations (Billing, VM, Storage, and Network)
- Job Management Configurations (Scheduling, Failure Management)
- VM Management Configurations (VM Selection, Scaling)
- VM configurations (./config/vm_config.txt)
- VM Type and Price
- VM CPU and Network Performance
- Workload configurations (./workload/workload.csv)
- Job Arrival, Deadline, Duration, Data Usage Information
- Workload file example
The goal of PICS is to precisely simulate the behavior of public IaaS clouds from the cloud users' perspective, as if they deployed a particular cloud application on a public IaaS cloud.
# A. Simulation Configuration
# A.1 Sim Trace Interval (e.g. every 60 sec)
SIM_TRACE_INTERVAL=60
# A.2 Workload File Path
WORK_LOAD_FILE=workload.csv
# A.3 VM Configuration Path
VM_CONFIG_FILE=vm_config.txt
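The configuration files use a plain KEY=VALUE layout with `#` comment lines. As a minimal sketch of how such a file could be read (this is not PICS's own loader; the function name `parse_config` is an assumption made for illustration):

```python
def parse_config(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comment lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


# Sample text copied from the Simulation Configuration section.
sample = """\
# A.1 Sim Trace Interval (e.g. every 60 sec)
SIM_TRACE_INTERVAL=60
# A.2 Workload File Path
WORK_LOAD_FILE=workload.csv
# A.3 VM Configuration Path
VM_CONFIG_FILE=vm_config.txt
"""

config = parse_config(sample)
trace_interval = int(config["SIM_TRACE_INTERVAL"])
print(trace_interval, config["WORK_LOAD_FILE"], config["VM_CONFIG_FILE"])
```

All values are returned as strings, so numeric settings such as SIM_TRACE_INTERVAL need an explicit conversion by the caller.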
# B. Public IaaS Configuration
# B.1 VM Billing Config - Set IaaS Pricing Model (Hour or Min-based)
VM_BILLING_TIME_UNIT=BTU_HOUR
# BTU_HOUR: Hourly Billing Model (e.g. Amazon Web Services)
# BTU_MIN : Minutely Billing Model (e.g. MS Azure & Google Compute Engine)
# B.1 VM Billing Conf - Set Billing Time Period (Int and > 0)
VM_BILLING_TIME_PERIOD=1
# IaaS Billing Model = VM_BILLING_TIME_UNIT * VM_BILLING_TIME_PERIOD
# e.g. VM_BILLING_TIME_UNIT=BTU_MIN and VM_BILLING_TIME_PERIOD=30
# --> 30min based billing
# Note that Unit Cost($) for each VM is defined at VM_CONFIG_FILE
# B.2 VM Startup Delay (Lagtime) - Min Startup LagTime for Creating a new VM
MIN_STARTUP_LAG=30
# B.2 VM Startup Delay (Lagtime) - Max Startup LagTime for Creating a new VM
MAX_STARTUP_LAG=60
# PICS determines startup lagtime for creating a new VM
# from MIN_STARTUP_LAG (>=0) to MAX_STARTUP_LAG (>=0)
# (MAX_STARTUP_LAG >= MIN_STARTUP_LAG)

# C. Cloud Storage Configurations
# C.1 Max Volume of Cloud Storage
# Unit: Mega Bytes
MAX_CAPACITY_OF_STORAGE=10240000
# C.2 Storage Usage Cost ($) for Gigabytes/Month
# STORAGE_UNIT_COST > 0
STORAGE_UNIT_COST=0.1
# C.3 Storage Billing Time Unit (Second, > 0)
# 1 : 1sec
# 60 : 1 min
# 3600 : 1 Hour
# 86400 : 1 Day
# 2592000: 1 Month
STORAGE_BILLING_TIME_UNIT=1

# D. Network Configuration
# D.1 Network Bandwidth for Data Transfer (Unit: MB/s)
# D.1.1 Bandwidth from Cloud to Cloud
PERF_DATA_TRANSFER_CLOUD=3.0
# D.1.2 Bandwidth from Incoming Traffic
PERF_DATA_TRANSFER_IN=2.0
# D.1.3 Bandwidth for Outgoing Traffic
PERF_DATA_TRANSFER_OUT=1.0
# D.2 Network Cost for Data Transfer (Unit:$)
# D.2.1 Network Cost from Cloud to Cloud
COST_DATA_TRANSFER_CLOUD=0.01
# D.2.2 Network Cost for Incoming Traffic
COST_DATA_TRANSFER_IN=0.02
# D.2.3 Network Cost for Outgoing Traffic
COST_DATA_TRANSFER_OUT=0.03
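The billing settings above combine into one billing period (VM_BILLING_TIME_UNIT * VM_BILLING_TIME_PERIOD). The sketch below shows one plausible way to turn a VM's running time into a charge. It rests on two assumptions that the configuration comments do not state: partial periods are rounded up to a whole period, and the VM unit price is charged once per billing period. The function names are hypothetical.

```python
import math

# Seconds per billing time unit, keyed by the VM_BILLING_TIME_UNIT values.
BTU_SECONDS = {"BTU_HOUR": 3600, "BTU_MIN": 60}


def billing_periods(run_seconds, time_unit="BTU_HOUR", time_period=1):
    """Number of billing periods charged (ASSUMPTION: partial periods round up)."""
    if time_period <= 0:
        raise ValueError("VM_BILLING_TIME_PERIOD must be an integer > 0")
    period_seconds = BTU_SECONDS[time_unit] * time_period
    return math.ceil(run_seconds / period_seconds)


def billed_vm_cost(run_seconds, unit_price, time_unit="BTU_HOUR", time_period=1):
    """VM charge (ASSUMPTION: unit_price is the price of one billing period)."""
    return billing_periods(run_seconds, time_unit, time_period) * unit_price


# A VM that ran for 3700 seconds under hourly and 30-minute billing.
print(billing_periods(3700, "BTU_HOUR", 1))   # 2 periods of 3600 s
print(billing_periods(3700, "BTU_MIN", 30))   # 3 periods of 1800 s
```

Under these assumptions the same run costs two unit prices with hourly billing and three with 30-minute billing, which is why the billing model interacts with the scale-down policy.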
# E. Job Management Configurations
# E.1 Job Scheduling Configuration
JOB_ASSIGNMENT_POLICY=EDF
# E.2 Job Failure Configurations
# E.2.1 Probability for Job Failure Occurrence
# 0 <= PROB_JOB_FAILURE <= 1
# e.g. 0.05: 5%
PROB_JOB_FAILURE=0.05
# E.3 Job Failure Recovery Policy
# JF-POLICY-01: ignore the failed job
# JF-POLICY-02: re-execute the failed job
# JF-POLICY-03: move the failed job to end of the job queue
# JF-POLICY-04: find another VM (running or new) to satisfy
# the failed job deadline
JOB_FAILURE_POLICY=JF-POLICY-01
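To illustrate the two ideas in this section, the sketch below orders jobs Earliest-Deadline-First (EDF) and draws a failure decision from PROB_JOB_FAILURE. It is an illustration of the general techniques, not PICS's scheduler: the job dictionaries, the `deadline` key, and both function names are assumptions.

```python
import heapq
import random


def edf_order(jobs):
    """Return jobs in Earliest-Deadline-First order (ties keep arrival order)."""
    # The index breaks ties so that job dicts are never compared directly.
    heap = [(job["deadline"], index, job) for index, job in enumerate(jobs)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def job_fails(prob_job_failure, rng):
    """Bernoulli draw: True with probability prob_job_failure."""
    if not 0 <= prob_job_failure <= 1:
        raise ValueError("PROB_JOB_FAILURE must satisfy 0 <= p <= 1")
    return rng.random() < prob_job_failure


# Hypothetical queue; deadlines are absolute simulation times here.
queue = [
    {"name": "job-a", "deadline": 2000},
    {"name": "job-b", "deadline": 1000},
    {"name": "job-c", "deadline": 1500},
]
print([job["name"] for job in edf_order(queue)])

rng = random.Random(42)  # seeded so that a run is repeatable
failures = sum(job_fails(0.05, rng) for _ in range(1000))
print(failures, "of 1000 simulated jobs failed")
```

What happens after a failure (ignore, re-execute, re-queue, or move to another VM) is selected by JOB_FAILURE_POLICY and is not modeled here.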
# F. VM Management Configuration
# F.1 VM Selection Policy for VM Scaling-up
# VM-SEL-COST : Cost based VM Selection
# VM-SEL-PERF : Performance based VM Selection
# VM-SEL-COSTPERF : Cost/Performance Balanced VM Selection
VM_SELECTION_METHOD=VM-SEL-COST
# F.2 Max Num of Concurrent VMs
# MAX_NUM_OF_CONCURRENT_VMS is
# > 0 or UNLIMITED
MAX_NUM_OF_CONCURRENT_VMS=UNLIMITED
# F.3 VM Scale Down Policy
# SD-IM : Immediate VM Scale-down when the VM is idle
# SD-HR : Hourly Billing Model-based Scale-Down (e.g. AWS)
# SD-MN : Minutely Billing Model-based Scale-Down (e.g. MS Azure)
# SD-SL : Startup-Lag based Scale-Down
# SD-JAT-MEAN : Mean Job Arrival Rate-based Scale Down
# SD-JAT-MAX : Maximum Job Arrival Rate-based Scale Down
# SD-JAT-MEAN-RECENT: Mean Recent Job Arrival Rate-based Scale Down (kNN)
# SD-JAT-MAX-RECENT : Max Recent Job Arrival Rate-based Scale Down
# SD-JAT-SLR : Simple Linear Regression (JAT)-based Scale Down
# SD-JAT-2PR : Quadratic Regression (JAT)-based Scale Down
# SD-JAT-3PR : Cubic Regression (JAT)-based Scale Down
# SD-JAT-LLR : Local Linear Regression (JAT)-based Scale Down
# SD-JAT-L2PR : Local Quadratic Regression (JAT)-based Scale Down
# SD-JAT-L3PR : Local Cubic Regression (JAT)-based Scale Down
# SD-JAT-WMA : Weighted Moving Average (JAT)-based Scale Down
# SD-JAT-ES : Exponential Smoothing (JAT)-based Scale Down
# SD-JAT-HWDES : Holt-Winters Double Exponential Smoothing (JAT)-based
# SD-JAT-BRDES : Brown's Double Exponential Smoothing (JAT)-based
# SD-JAT-AR : Autoregressive-based Scale Down
# SD-JAT-ARMA : Autoregressive and Moving Average-based Scale Down
# SD-JAT-ARIMA : Autoregressive Integrated Moving Average-based
VM_SCALE_DOWN_POLICY_NAME=SD-IM
# F.4 VM Scale Down Policy Unit
# This configuration is only applicable for
# billing model based Scale Down
# (e.g. SD-HR and SD-MN)
# e.g. VM_SCALE_DOWN_POLICY_NAME=SD-MN and VM_SCALE_DOWN_POLICY_UNIT=10
# --> 10 min based Scale Down
VM_SCALE_DOWN_POLICY_UNIT=1
# F.5 Num of Recent Sample for SD Policies
# This configuration is applicable for RECENT-based SD policies.
# (e.g. SD-JAT-*-RECENT)
VM_SCALE_DOWN_POLICY_RECENT_SAMPLE_CNT=50
# F.6 First Parameter for Timeseries.
# alpha for WMA, ES, HWDES, BRDES: 0 < alpha < 1
# p for AR, ARMA, ARIMA (p >= 0)
VM_SCALE_DOWN_POLICY_PARAM1=0.5
# F.7 Second Parameter for Timeseries.
# beta for HWDES (0 < beta < 1)
# q for ARMA and ARIMA (q >= 0)
VM_SCALE_DOWN_POLICY_PARAM2=0.5
# F.8 Third Parameter for Timeseries.
# d for ARIMA (d >= 0)
VM_SCALE_DOWN_POLICY_PARAM3=2
# F.9 MIN/MAX for Wait Time of VM Scale Down
# These MIN/MAX fields are related to predictive methods such as SD-JAT-SLR.
# To handle wrong prediction results
# --> too short (or negative) or too long wait time
VM_SCALE_DOWN_POLICY_MIN_WAIT_TIME=1
VM_SCALE_DOWN_POLICY_MAX_WAIT_TIME=UNLIMITED
# F.10 Vertical Scaling
# Vertical Scaling - Enable: YES, Disable: NO
# When enabling Vertical Scaling, MAX_NUM_OF_CONCURRENT_VMS
# shouldn't be UNLIMITED
ENABLE_VERTICAL_SCALING=NO
# F.11 Vertical Scaling Operation
# VSCALE-UP : Only VScale-up
# (triggered when VM cannot meet deadline for queued jobs)
# VSCALE-DOWN : Only VScale-down
# (triggered when VM meets deadline - find most suitable one
# for queued jobs (e.g. cheapest VM with deadline satisfaction))
# VSCALE-BOTH : Both VScale-up/down
# F.12 Vertical Scaling Options
VERTICAL_SCALING_OPERATION=VSCALE-BOTH

VM Configurations Detail (./config/vm_config.txt)

# Number of VM types used in PICS simulation and n > 0
NO_OF_VM_TYPES=n
# First VM Type Name
VM1_TYPE_NAME=t2.micro
# First VM Unit Price ($)
VM1_UNIT_PRICE=0.1
# First VM CPU Performance
# Used to calculate job duration on VM type
# Less value for CPU factor is better
VM1_CPU_FACTOR=2.0
# First VM Network Performance
# Used to calculate data transfer rate on VM type
# Less value for NET factor is better
VM1_NET_FACTOR=2.0
# Second VM Type Name
VM2_TYPE_NAME=t2.micro
# Second VM Unit Price ($)
VM2_UNIT_PRICE=0.2
# Second VM CPU Performance
VM2_CPU_FACTOR=1.5
# Second VM Network Performance
VM2_NET_FACTOR=1.5
...
# nth VM Type Name
VMn_TYPE_NAME=nth_VM_Type
# nth VM Unit Price ($)
VMn_UNIT_PRICE=1.0
# nth VM CPU Performance
VMn_CPU_FACTOR=1.0
# nth VM Network Performance
VMn_NET_FACTOR=1.0
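The numbered VMi_* keys can be grouped into one record per VM type. The sketch below does that for an already-parsed key/value mapping and applies the CPU factor to a job duration (standard duration * CPU_FACTOR). It is illustrative only: `load_vm_types` and `actual_duration` are hypothetical names, and the second type name `example.small` is made up so that the two types are distinguishable.

```python
def load_vm_types(settings):
    """Group VMi_* keys of a parsed vm_config mapping into one dict per VM type."""
    vm_types = []
    for i in range(1, int(settings["NO_OF_VM_TYPES"]) + 1):
        vm_types.append({
            "name": settings[f"VM{i}_TYPE_NAME"],
            "unit_price": float(settings[f"VM{i}_UNIT_PRICE"]),
            "cpu_factor": float(settings[f"VM{i}_CPU_FACTOR"]),
            "net_factor": float(settings[f"VM{i}_NET_FACTOR"]),
        })
    return vm_types


def actual_duration(standard_duration, vm_type):
    """Job duration on a VM type: standard duration * CPU_FACTOR."""
    return standard_duration * vm_type["cpu_factor"]


# Values follow the vm_config.txt example, except the invented second type name.
settings = {
    "NO_OF_VM_TYPES": "2",
    "VM1_TYPE_NAME": "t2.micro",
    "VM1_UNIT_PRICE": "0.1",
    "VM1_CPU_FACTOR": "2.0",
    "VM1_NET_FACTOR": "2.0",
    "VM2_TYPE_NAME": "example.small",  # hypothetical name
    "VM2_UNIT_PRICE": "0.2",
    "VM2_CPU_FACTOR": "1.5",
    "VM2_NET_FACTOR": "1.5",
}

vm_types = load_vm_types(settings)
for vm in vm_types:
    print(vm["name"], actual_duration(200, vm))
```

A job with a standard duration of 200 therefore takes 400 on the first type and 300 on the second, so the cheaper type is also the slower one in this example.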
Workload File Example (./workload/workload.csv)
#job_submit_interval,job_duration,job_deadline,input_data,output_data
100,200,1000,NONE_0,NONE_0
100,200,1500,NONE_0,IC_524288
100,200,2000,NONE_0,OC_524288
100,200,2500,IC_524288,NONE_0
100,200,3000,IC_524288,IC_524288
100,200,3500,IC_524288,OC_524288
100,200,4000,OC_524288,NONE_0
100,200,4500,OC_524288,IC_524288
100,200,5000,OC_524288,OC_524288
- job_submit_interval: the job generation interval (unit: seconds of the PICS simulation clock). For example, when consecutive rows start with 100 (e.g. 100,200,1000,NONE_0,NONE_0), the first job is generated at 100 simulation seconds and the next job at 200 seconds (100 seconds after the previous job generation).
- job_duration: the standard job duration. The actual job duration on each VM is calculated as standard duration * the VM's CPU_FACTOR. For example, the actual job duration on a VM with CPU_FACTOR=2.0 is 400 (200 * 2.0).
- input_data: NONE_0: no input data (size is zero). IC_xxx: xxx megabytes of input data, transfer direction: Cloud => Cloud. OC_xxx: xxx megabytes of input data, transfer direction: Outside => Cloud.
- output_data: NONE_0: no output data (size is zero). IC_xxx: xxx megabytes of output data, transfer direction: Cloud => Cloud. OC_xxx: xxx megabytes of output data, transfer direction: Cloud => Outside.
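The field descriptions above can be turned into a small reader. The sketch below accumulates job_submit_interval into absolute generation times and splits the data fields into a direction and a size. It is a sketch rather than PICS's own loader; `parse_data_spec`, `load_jobs`, and the dictionary keys are hypothetical names.

```python
import csv
import io
from itertools import accumulate


def parse_data_spec(spec):
    """Split a field such as 'IC_524288' into ('IC', 524288)."""
    direction, _, size = spec.partition("_")
    if direction not in {"NONE", "IC", "OC"}:
        raise ValueError(f"unknown data direction: {spec!r}")
    return direction, int(size)


def load_jobs(text):
    """Read workload rows; '#' header lines and blank lines are skipped."""
    rows = [row for row in csv.reader(io.StringIO(text))
            if row and not row[0].startswith("#")]
    # Each interval is relative to the previous job, so arrivals accumulate.
    arrivals = accumulate(int(row[0]) for row in rows)
    return [
        {
            "arrival": arrival,
            "duration": int(row[1]),
            "deadline": int(row[2]),
            "input": parse_data_spec(row[3]),
            "output": parse_data_spec(row[4]),
        }
        for arrival, row in zip(arrivals, rows)
    ]


sample = """\
#job_submit_interval,job_duration,job_deadline,input_data,output_data
100,200,1000,NONE_0,NONE_0
100,200,1500,NONE_0,IC_524288
100,200,2000,NONE_0,OC_524288
"""

jobs = load_jobs(sample)
print([job["arrival"] for job in jobs])  # [100, 200, 300]
print(jobs[1]["output"])                 # ('IC', 524288)
```

The arrival times 100, 200, 300 match the description of job_submit_interval: each job is generated 100 seconds after the previous one.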
(*) You can find report files at "Logs/pics_log-YYYY-MM-DD-hh-mm-ss/Report/"
(*) In most cases, the following THREE (*) result files are the most important ones:
- 1.report_simulation_trace_broker.csv provides
- real time trace for incoming workloads. (e.g. JOB_RECV(CUMM) and JOB_RECV(UNIT))
- real time trace for workload completion. (e.g. JOB_COMP(CUMM) and JOB_COMP(UNIT))
- real time trace for VM usage status and cost. (e.g. VM_*)
- Meaning of all attributes for 1.report_simulation_trace_broker.csv
- CLOCK: Simulation clock.
- JOB_RECV(CUMM): The cumulative number of jobs received up to the current simulation clock.
- JOB_RECV(UNIT): The number of received jobs at the simulation clock.
- JOB_COMP(CUMM): The cumulative number of jobs completed up to the current simulation clock.
- JOB_COMP(UNIT): The number of completed jobs at the simulation clock.
- VM_RUN: The number of currently running VMs (VM_STUP + VM_ACT). This counts VMs that are starting up and active VMs, but not stopped VMs (VM_STOP).
- VM_STUP: The number of currently starting up VMs.
- VM_ACT: The number of currently active VMs.
- VM_STOP: The number of currently stopped VMs.
- VM_COST($): The accumulated VM cost at the simulation clock.
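As a sketch of how the broker trace attributes relate to each other, the function below checks the stated identity VM_RUN = VM_STUP + VM_ACT for one trace row and derives the number of jobs received but not yet completed. The rows are made-up sample values, the function name is hypothetical, and the backlog figure assumes that every received job eventually appears in JOB_COMP(CUMM), which the attribute list does not state (for example for failed jobs).

```python
def job_backlog(row):
    """Check VM_RUN = VM_STUP + VM_ACT and return received-minus-completed jobs."""
    if row["VM_RUN"] != row["VM_STUP"] + row["VM_ACT"]:
        raise ValueError(f"inconsistent VM counts at CLOCK={row['CLOCK']}")
    return row["JOB_RECV(CUMM)"] - row["JOB_COMP(CUMM)"]


# Made-up trace rows, one per SIM_TRACE_INTERVAL of 60 seconds.
trace = [
    {"CLOCK": 60, "JOB_RECV(CUMM)": 0, "JOB_COMP(CUMM)": 0,
     "VM_RUN": 0, "VM_STUP": 0, "VM_ACT": 0},
    {"CLOCK": 120, "JOB_RECV(CUMM)": 1, "JOB_COMP(CUMM)": 0,
     "VM_RUN": 1, "VM_STUP": 1, "VM_ACT": 0},
    {"CLOCK": 600, "JOB_RECV(CUMM)": 5, "JOB_COMP(CUMM)": 2,
     "VM_RUN": 3, "VM_STUP": 1, "VM_ACT": 2},
]

backlog = [job_backlog(row) for row in trace]
print(backlog)  # [0, 1, 3]
```

If the CSV header uses these attribute names, `csv.DictReader` could supply the rows after converting each value to `int`; the exact header spelling should be checked against a real report file.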
- 2.report_simulation_trace_iaas.csv provides
- real time trace for cloud usage and cost.
- real time trace for network usage and cost.
- Meaning of all attributes for 2.report_simulation_trace_iaas.csv
- CLOCK: Simulation clock.
- # SC: The number of storage containers (e.g. S3 buckets).
- # SFO: The number of storage file objects (e.g. total # of files in all S3 buckets).
- ST_SIZE (KB): The current size (Kilo Bytes) of cloud storage (e.g. S3) at the simulation clock.
- ST_COST ($): The current cost of cloud storage (e.g. S3) at the simulation clock.
- NET-IN (KB): The amount of data transmission (outside of clouds --> clouds (IaaS data center)) at the simulation clock.
- NET-OUT (KB): The amount of data transmission (clouds (IaaS data center)--> outside of clouds) at the simulation clock.
- NET-CLOUD (KB): The amount of data transmission (clouds <--> clouds in the same data center) at the simulation clock.
- NET-IN_COST ($): The network cost for NET-IN (item #6).
- NET-OUT_COST ($): The network cost for NET-OUT (item #7).
- NET-CLOUD_COST ($): The network cost for NET-CLOUD (item #8).
- NET_COST ($): The total cost for network usage: NET-IN_COST + NET-OUT_COST + NET-CLOUD_COST (items #9, #10, and #11).
- 3.report_job_complete_report.csv provides
- detailed information for each workload processing.
- CPU time -- e.g., item #5: CPU
- Network time -- e.g., item #4 and #6: IN/OUT
- Deadline satisfaction -- e.g., item #15: DF
- Total duration -- e.g. item #13: TD and item #14: RT
- Cost for each workload processing -- e.g. item #16: CO($)
- Meaning of all attributes for 3.report_job_complete_report.csv
- ID: Job ID.
- JN: Job Name.
- ADR: Actual Job Duration.
- IN: Network time for data transmission to VM before this job processing.
- CPU: CPU time for the job processing.
- OUT: Network time for data transmission from VM to outside of VM after this job processing (e.g. output file data transfer).
- DL: Job deadline.
- VM: Assigned VM ID for this job processing.
- TG: Time for job generation. (job ingress time.)
- TA: Time for job assignment to the particular VM.
- TS: Time for job processing start.
- TC: Time for job processing completion.
- TD: Total duration for job processing. (TC - TG)
- RT: Job runtime. (TC - TS)
- DF: Difference from job deadline
- Positive value: job deadline satisfaction.
- Negative value: job deadline miss.
- CO($): Cost for this job processing.
- ST: Job state.
- JOB_ST_COMPLETED (3004): this job is successfully completed.
- JOB_ST_FAILED (3005): this job failed.
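The job report attributes lend themselves to simple summary statistics. The sketch below recomputes TD (TC - TG) and RT (TC - TS) as defined above and uses the sign of DF to count deadline misses. The rows are made-up values, `summarize_jobs` is a hypothetical helper, and treating DF = 0 as a met deadline is an assumption, since the list only defines positive and negative values.

```python
def summarize_jobs(rows):
    """Mean total duration, mean runtime, and deadline miss rate of job rows."""
    if not rows:
        raise ValueError("no job rows to summarize")
    total_durations = [row["TC"] - row["TG"] for row in rows]  # TD = TC - TG
    runtimes = [row["TC"] - row["TS"] for row in rows]         # RT = TC - TS
    missed = sum(1 for row in rows if row["DF"] < 0)           # negative DF = miss
    return {
        "mean_TD": sum(total_durations) / len(rows),
        "mean_RT": sum(runtimes) / len(rows),
        "deadline_miss_rate": missed / len(rows),
    }


# Made-up rows: the second job waits in the queue and misses its deadline.
rows = [
    {"TG": 100, "TS": 160, "TC": 560, "DF": 540},
    {"TG": 200, "TS": 560, "TC": 960, "DF": -60},
]

summary = summarize_jobs(rows)
print(summary)
```

The gap between mean_TD and mean_RT is the time jobs spend between generation and processing start, which is the part that scheduling and scaling policies can influence.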
- 4.report_vm_usage_report.csv provides
- detailed information for VM usage.
- VM usage cost -- e.g. item #3: CO($)
- VM running time -- e.g. item #2: RT
- VM utilization -- e.g. item #13: UT
- # of processed jobs -- e.g. item #11: NJ
- Vertical scaling decision -- e.g. item #18, #19, and #20.
- Meaning of all attributes for 4.report_vm_usage_report.csv
- VMID: VM (Virtual Machine) ID.
- RT: VM runtime.
- CO($): VM Cost.
- IID: VM Instance ID.
- TY: VM Type. (e.g. m3.xlarge)
- ST: VM State.
- VM_ST_CREATING (3101): This VM is currently being created.
- VM_ST_ACTIVE (3102): This VM is currently running (active).
- VM_ST_TERMINATE (3103): This VM is terminated.
- TC: Time when the VM was created.
- TA: Time when the VM was activated.
- TT: Time when the VM was terminated.
- SL: Startup lag time for this VM.
- NJ: The number of jobs processed by this VM.
- JR: Job runtime on this VM.
- UT: VM Utilization (e.g. 0.9 = 90% of utilization)
- SR: Startup portion of total VM running time.
- ID: Idle portion of total VM running time.
- LJCT: Simulation clock for the last job completion on this VM.
- SDWT: Scale down wait time -- wait time before termination of this VM.
- IS_VS_VICTIM: True if this VM is a victim of vertical scaling up. False if it is not eligible for vertical scaling.
- VS_CASE: Case for vertical scaling.
- VS_VICTIM_ID: -1 if this VM is not related to vertical scaling. Otherwise, this VM is a vertical scaling case and this field identifies the victim of vertical scaling.
- 5.report_storage_usage.csv provides
- detailed information for Cloud Storage usage including storage usage time, cost, and volume size.
- Meaning of all attributes for 5.report_storage_usage.csv
- If TY is SC: this is information for a storage container. (e.g. S3)
- ID: Storage Container ID. (e.g. S3 ID)
- CR: ID of the job that created this storage container.
- TR: The simulation entity that terminates this storage container.
- RG: Storage container region.
- PM: Permission for this storage. (e.g. SC_PERMISSION_PUBLIC (4001): public, SC_PERMISSION_PRIVATE (4002): private, SC_PERMISSION_GROUP (4003): group permission)
- ST: State for the storage container.
- CT: Time for creation.
- DT: Time for deletion.
- DR: Duration for which this storage container is active.
- NF: The number of stored files.
- VL(KB): The volume for storage container.
- CO($): Cost for this storage container.
- If TY is SFO: this is information for a file object in a particular storage container.
- ID: File object ID.
- SC: Storage container ID.
- SZ(KB): File object size (KB).
- ON: ID of the job that created the file.
- ST: File status.
- DST: Data status.
- PSZ(KB): File (planned) size.
- CT: File creation time.
- AT: File activated time.
- DT: File deleted time.
- DR: File active duration.
- CO($): File storage cost.
- 6.report_network_usage.csv provides
- detailed information for Network usage including network cost for incoming/outgoing data transfer.
- Meaning of all attributes for 6.report_network_usage.csv
- JOBID: Job ID for network usage.
- IN_TS(KB): Input file size.
- IN_DR: Input file flow direction. (e.g., IFTD_IC: input file is from inside clouds, IFTD_OC: input file is from outside of clouds.)
- IN_COST($): Network usage cost for input file.
- OUT_TS_PLANNED(KB): Output file size (planned).
- OUT_TS_ACTUAL(KB): Output file size (actual).
- OUT_COST($): Network usage cost for output file.
- TOTAL_COST($): Total network cost for input/output files (IN_COST($) + OUT_COST($))
PICS Validation Results
To validate the correctness of PICS simulation, we compared PICS with a real-world cloud application running on Amazon Web Services.
- Baseline Information
- Baseline Public Cloud: Amazon Web Services (focusing on EC2 instances and S3 storage)
- Baseline Cloud Application: Hadoop (MapReduce) Application
- Workload Patterns: Steady, Bursty, and Poisson-based Random Workload Patterns.
- Job Scheduling: EDF Scheduling
- VM Selection: Cost-based VM Selection
- VM Scaling: Horizontal (Scale-Out/In) and Vertical (Scale-Up/Down) Scaling
- PICS Validation Results
- Cost Traces
- Bursty Workload
- Random(Poisson) Workload
- VM Utilization
- VM Scaling
- Bursty Workload
- Random(Poisson) Workload
- Job Deadline
- Steady Workload
- Bursty Workload
Setup and PICS Execution
- Prerequisite Packages
- Run PICS Simulator
$ wget https://github.com/ik2sb/PICS/archive/master.tar.gz or master.zip
$ tar xvfz master.tar.gz or unzip master.zip
$ cd PICS-master
# Configure your simulation settings:
# - General Configuration: ./config/config.txt
# - VM Configuration: ./config/vm_config.txt
# - Workload Configuration: ./workload/workload.csv
$ python run_simulation.py
Project Team Members
- Prof. Marty Humphrey (Computer Science at University of Virginia)
- Prof. Wei Wang (Computer Science at University of Texas at San Antonio)
- In Kee Kim (Computer Science at University of Virginia)
- In Kee Kim, Wei Wang, and Marty Humphrey, "PICS: A Public IaaS Cloud Simulator", 8th IEEE International Conference on Cloud Computing (IEEE CLOUD 2015)
If you need any technical support for PICS, please contact In Kee Kim (firstname.lastname@example.org).