Assignment 1
Learning Objectives
- read and write simple text files
- load data selectively
- control conditional and iterative execution
- create data frames and vectors
- make R code readable
Data Files
Tasks
Before diving into the programming problems, study the data files that are provided for the assignment.
- (10 Points) Load the data file into an appropriate data object of your choice. The file is compressed, so you can either uncompress the file after downloading or use one of the various R functions to load a zipped (compressed) file. Look at the other questions to determine what object is most appropriate: data frame versus vector? Figure out how to deal with missing values. Add your strategy as a comment to the functions that you write in the next few tasks.
- (30 Points) Write a function called AvgArrDelayByCarriers(Carrier) that calculates and returns the average arrival delay of the carrier passed to the function. You must exclude any missing or negative (early) arrival delays.
- (30 Points) Write a function called ProbDepartureDelaysByOrigin(Origin) that calculates the probability of a departure delay for a particular airport. It should only count departure delays. The probability is the fraction of flights from that airport that have a positive departure delay.
- (30 Points) Write a function called AvgFlightDelay(Dep, Dest) that calculates and returns the average arrival delay in minutes for a flight between two airports. Assume a delay of 0 for any missing delay value and count all delays, including negative delays (i.e., early departure or arrival).
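The tasks above treat missing and negative delays differently, so it helps to see how each choice changes the result. The following is a minimal sketch on a made-up vector, not on the assignment data; the variable names are hypothetical, and counting a missing value as "not delayed" in the fraction is an assumption you would need to justify in your own comments.

```r
# Made-up delay values in minutes; NA marks a missing value.
delay <- c(10, NA, -5, 0, 30)

# Strategy A: drop missing values before averaging.
avg_excluding_na <- mean(delay[!is.na(delay)])

# Strategy B: treat a missing value as a delay of 0.
filled <- ifelse(is.na(delay), 0, delay)
avg_zero_filled <- mean(filled)

# The mean of a logical vector is the fraction of TRUE values,
# here the fraction of entries with a positive delay.
frac_positive <- mean(filled > 0)

print(avg_excluding_na)
print(avg_zero_filled)
print(frac_positive)
```

The two averages differ because Strategy B adds an extra zero to the sum and to the count, which is why the strategy should be stated in a comment on each function.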
Notes
ADDITIONAL DATA DESCRIPTION: Each record in AirlinesDelays.txt describes one specific flight. Seven columns represent a delay for a flight: DEP_DEL, ARR_DELAY, CARRIER_DELAY, WEATHER_DELAY, NAS_DELAY, SECURITY_DELAY and LATE_AIRCRAFT_DELAY. You can tally the number of delays for a flight by counting how many of these variables are greater than 0.
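The tally described above can be sketched as follows. The data frame and its values are made up for illustration; only the column names come from the description, and treating a missing value as "no delay" is an assumption.

```r
# Toy data frame: column names follow the description, values are invented.
flights <- data.frame(
  DEP_DEL             = c(10, -3, NA),
  ARR_DELAY           = c(15, -5, 20),
  CARRIER_DELAY       = c(15, NA, 0),
  WEATHER_DELAY       = c(0, NA, 0),
  NAS_DELAY           = c(0, NA, 20),
  SECURITY_DELAY      = c(0, NA, 0),
  LATE_AIRCRAFT_DELAY = c(0, NA, 0)
)

delay_cols <- c("DEP_DEL", "ARR_DELAY", "CARRIER_DELAY", "WEATHER_DELAY",
                "NAS_DELAY", "SECURITY_DELAY", "LATE_AIRCRAFT_DELAY")

# Comparing NA with 0 yields NA; na.rm = TRUE skips those cells,
# so a missing value is not counted as a delay.
num_delays <- rowSums(flights[, delay_cols] > 0, na.rm = TRUE)

print(num_delays)
```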
CODING STYLE: Part of the assignment is to practice good coding standards: consistent naming, formatting with indentation, testing input values, dealing with errors, commenting the code, etc. You need to submit an R script, which is a text file with an .R extension. To help you and us read the code, you must follow the coding standards mentioned above. This applies to ALL assignments going forward. If your code is not readable and/or does not run, 20% will be automatically deducted from your assignment grade.
CODE FUNCTIONALITY: Each function must return its result as a return value, which is then printed from the calling function or main script. This makes the function truly a "function" rather than a "procedure": functions return values, whereas procedures do work and do not return values. Returning a value also makes it easier to deal with error conditions. For example, you could return -1 if the data could not be found. The calling function can then test for that condition.
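A minimal sketch of this pattern is shown below. MeanPositive is a hypothetical helper written for illustration, not one of the required functions; it returns -1 as a sentinel, which is only safe here because a valid result is always positive.

```r
# Hypothetical helper: mean of the positive values in x,
# or -1 if x contains no positive values.
MeanPositive <- function(x) {
  # Test the input before using it.
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }
  positive <- x[!is.na(x) & x > 0]
  if (length(positive) == 0) {
    return(-1)  # sentinel: no usable data found
  }
  return(mean(positive))
}

# The caller, not the function, tests for the sentinel and prints.
result <- MeanPositive(c(12, NA, -4, 8))
if (result == -1) {
  print("No positive values found")
} else {
  print(result)
}
```

Keeping the printing in the caller means the same function can be reused in further calculations or checked in a test.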
You can either uncompress the file after downloading it or use one of the R functions that load a zipped (compressed) file directly.
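As a rough illustration of reading a compressed file without uncompressing it first, the sketch below writes a tiny gzip-compressed CSV to a temporary file and reads it back. The data are made up, and the archive type, archive name and delimiter of the actual assignment file are not stated here, so the commented-out unz() line uses a hypothetical archive name and assumes a comma-separated file.

```r
# Write a tiny gzip-compressed CSV to a temporary file (invented data).
tmp <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp, "w")
writeLines(c("CARRIER,ARR_DELAY", "AA,12", "DL,NA", "AA,-4"), con)
close(con)

# read.csv() opens the path with file(), which detects gzip compression,
# so no separate uncompress step is needed. The string "NA" is read as missing.
delays <- read.csv(tmp, stringsAsFactors = FALSE)
print(delays)

# For a .zip archive, a unz() connection reads one member without extracting.
# "archive.zip" is a hypothetical name:
# delays <- read.csv(unz("archive.zip", "AirlinesDelays.txt"))

unlink(tmp)  # remove the temporary file
```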
Test Cases
TBD
Deliverables & Submission Instructions
You need to submit a file with an .R extension or an R Notebook. Be sure to state all assumptions and give explanations as comments in the .R file wherever needed to help us assess your submission. Please name the submission file LAST_FirstInitial_1.R; for example, John Smith's file should be named Smith_J_1.R. Note in the comments anything that does not work or that you did not complete. Make sure that whatever you submit works; no credit will be given for code that does not work. Upload the submission to Blackboard. Make sure you follow the R Programming Style Guide.
Scoring
Total Number of Earnable Points: 100
Approximate Time to Complete: 2-3 hours
Due Date: see Calendar or Blackboard