ACKNOWLEDGEMENT
First of all, I want to thank my supervisor, Mr. Narendra Kumar, a programmer manager. I consider it a privilege to express my gratitude and respect to him for guiding and inspiring me towards the successful completion of the project. There were many ups and downs along the way; many times I succeeded and many times I failed. It was he who encouraged me and showed me the right way to work. Without him I would not have lasted a day in the organization. The best support I got from my institute was its name behind mine; without my college's support I would not even have been able to think about working in such an esteemed organization.
The internship opportunity I had with ONGC was a great chance for learning and professional development. I therefore consider myself very lucky to have been given the opportunity to be a part of it. I am also grateful for having had the chance to meet so many wonderful people and professionals who led me through this internship period.
I am using this opportunity to express my deepest gratitude and special thanks to Mr. S.B. Shah () for allowing me to carry out my project at this esteemed organization. I express my deepest thanks to Mr. U.S. Pandey, DGM (CGS Programming), for taking part in useful decisions, giving the necessary advice and guidance, and arranging all facilities to make life easier. I also place on record my best regards and deepest sense of gratitude to Mr. Bhanu Pratap Singh (Senior Programming Officer) for his careful and precious guidance, which was extremely valuable for my study both theoretically and practically; for guiding me proactively throughout the training process; for explaining the operation of the Data Centres and the responsibilities of the System Administrators and Support Groups; and for taking time out of his busy schedule to help me through the development of applications using the Python programming language, its libraries and plotly.
Finally, I would like to thank my university for allowing me to get such a wonderful experience. I perceive this opportunity as a big milestone in my career development. I will strive to use the skills and knowledge I have gained in the best possible way, and I will continue to work on improving them in order to attain my desired career objectives. I hope to continue cooperating with all of you in the future.
AISHWARYA PRAKASH
CENTRAL UNIVERSITY OF RAJASTHAN
ROLL NO. 2016MSCS001
DATE: 15TH FEB 2018
PLACE: O.N.G.C Priyadarshani, Mumbai
INTRODUCTION
Since joining the winter training programme on 14th December 2017, which spanned eight weeks, I have had the opportunity to work on the development of a range of system-based technical projects. Over the course of my training, I have learnt various aspects of professional software development, such as secure coding practices that adhere to the latest industry standards, writing tests to break and fix code and, most importantly, writing quality code. I also learned some basics of the Linux operating system (Ubuntu) and had the opportunity to learn a new programming language, Python. All applications were developed primarily in Python on Ubuntu.
My first task was to develop a health check-up for geophones, which displays the variation of resistance in the geophones for a particular day using plotly (with Python and pandas) and which could serve the custom needs of geophysicists throughout the world. My second task was to develop an admin program that gives information about the system the user is logged into, which could serve the custom needs of users here at ONGC. In the following pages of this report I attempt to summarise the projects I have developed over the course of my winter training.
ABOUT ONGC
Oil and Natural Gas Corporation Limited (ONGC) is an Indian multinational oil and gas company headquartered in Dehradun, Uttarakhand, India. It is a Public Sector Undertaking (PSU) of the Government of India, under the administrative control of the Ministry of Petroleum and Natural Gas. It is India's largest oil and gas exploration and production company. It produces around 69% of India's crude oil (equivalent to around 30% of the country's total demand) and around 62% of its natural gas. ONGC was founded on 14 August 1956 by the Government of India. Its international subsidiary ONGC Videsh currently has projects in 17 countries. ONGC has discovered 6 of the 7 commercially producing Indian basins in the last 50 years, adding over 7.1 billion tonnes of in-place oil and gas volume of hydrocarbons in Indian basins. ONGC went offshore in the early 1970s and discovered a giant oil field in the form of Bombay High, now known as Mumbai High. This discovery, along with subsequent discoveries of huge oil and gas fields in the Western offshore, changed the oil scenario of the country. ONGC has many offices throughout India. Dinesh K Sarraf is the Chairman & Managing Director of Oil and Natural Gas Corporation Ltd (ONGC), India's most valuable Maharatna public sector enterprise.
• Maharatna ONGC is the largest producer of crude oil and natural gas in India, contributing around 70 per cent of Indian domestic production. The crude oil is the raw material used by downstream companies like IOC, BPCL, and HPCL to produce petroleum products like Petrol, Diesel, Kerosene, Naphtha, and Cooking Gas (LPG).
• ONGC is India’s Top Energy Company and ranks 20th among global energy majors (Platts).
• ONGC ranks 14th in ‘Oil and Gas operations’ and 220th overall in Forbes Global 2000. Acclaimed for its Corporate Governance practices, ONGC has been ranked 26th among the biggest publicly traded global giants by Transparency International.
• It is one of the most valued public enterprises in India, and one of the highest profit-making and dividend-paying. ONGC has the unique distinction of being a company with in-house service capabilities in all areas of Exploration and Production of oil & gas and related oilfield services. It is a winner of the Best Employer award, and a dedicated team of over 33,927 professionals toils round the clock in challenging locations.
• Its wholly-owned subsidiary ONGC Videsh Limited (OVL) is the biggest Indian multinational in the energy space, participating in 36 oil and gas properties in 17 countries. ONGC subsidiary Mangalore Refinery and Petrochemicals Limited (MRPL) is a Schedule ‘A’ Miniratna, with a single-location refining capacity of 15 million tons per annum.
• ONGC is ranked as the Top Energy Company in India, fifth in Asia and 20th globally as per the Platts Top 250 Global Energy Rankings, 2016; it maintains its place as the world's third-ranked E&P company in the list.
• Ranked 464 in the Newsweek Green Rankings World's Greenest Companies 2016; ranked 14th in the global Oil and Gas Operations industry in the Forbes Global 2000 list of the world's biggest companies for 2016; ranked 220 in the overall 2016 list, based on sales.
VRC - VIRTUAL REALITY CENTRE
“THIRD EYE CENTRE” AT VASUNDHARA BHAVAN, ONGC, BANDRA, MUMBAI
Visualization with Virtual Reality (VR) technology has emerged as a powerful tool in the E&P industry for quick identification of depositional plays, improving efficiencies and reducing drilling risks. These are the things that I observed in Vasundhara Bhavan:
DLP 3-Chip Projectors
DLP 3-chip projectors are used for high-performance, high-brightness applications in large rooms such as lecture halls, digital cinemas and other large-audience venues. 3-chip systems produce stunning images in almost any environment. 3-chip DLP technology is currently considered the top-of-the-line technology for digital projection.
Workstation
A workstation is a high-end microcomputer designed for technical or scientific applications. These machines are intended primarily to be used by one person at a time. Workstations and the applications designed for them are used by small engineering companies, architects, graphic designers, and any organization, department, or individual that requires a faster microprocessor, a large amount of random access memory (RAM), and special features such as high-speed graphics adapters. Historically, the workstation developed technologically at about the same time and for the same audience as the UNIX operating system, which is often used as the workstation operating system.
LTO 5 Drive
Linear Tape-Open (LTO) is a magnetic tape data storage technology.
42” Colour Plotter
A plotter is a computer printer for printing vector graphics. In the past, plotters were used in applications such as computer-aided design, though they have generally been replaced with wide-format conventional printers. Plotters offered the fastest way to efficiently produce very large drawings or colour high-resolution vector-based artwork when computer memory was very expensive, processor power was very limited, and other types of printers had limited graphic output capabilities.
Multifunction Printers
An MFP (Multi Function Product/Printer/Peripheral), multifunctional, all-in-one (AIO), or Multifunction Device (MFD) is an office machine which incorporates the functionality of multiple devices in one, so as to have a smaller footprint in a home or business setting (the SOHO market segment), or to provide centralized document management/distribution/production in a large-office setting. A typical MFP may act as a combination of some or all of the following devices: Printer, Scanner, Photocopier, Fax, Email.
Network-attached storage (NAS)
NAS is file-level computer data storage connected to a computer network, providing data access to heterogeneous clients. NAS not only operates as a file server, but is specialized for this task by its hardware, software, or the configuration of those elements. NAS is shared storage on a local area network. A NAS server is a storage appliance that consists of a high-performance file server that plugs into a LAN.
Multimode Optical Fibre Cable
Multi-mode optical fibre (multimode fibre or MM fibre) is a type of optical fibre mostly used for communication over short distances, such as within a building or on a campus. Typical multimode links have data rates of 10 Mbit/s to 10 Gbit/s over link lengths of up to 600 meters.
Image Generator – Servers
Linux based application
HP Z800 Workstation (IG for Windows)
Combining ultimate performance with a revolutionary new industrial design, the HP Z800 Workstation delivers the extreme speed and massive expandability that is demanded to tackle the biggest challenges. It offers whole-system computational power from a workstation that optimizes the way the processor, memory, graphics, OS, and software technology work together.
Display Systems
The Christie Mirage WU14K-J 3-chip DLP projector offers exceptional image quality and detail in a compact size.
Multi-Channel Amplifiers
MCA Series Multi-Channel Amplifiers provide eight channels of power amplification. Model MCA 8050 delivers 50 watts/channel into 4 ohms. Channels may be bridged in pairs for higher combined wattage. Connections are provided for remote control of channel levels and muting.
High End Workstations
Un-interrupted Power Supply
DATA CENTRES
GEOSIM CENTRE
GeoSim is an Advanced Geological Modelling and Reservoir Simulation Centre set up with the aim of consolidating and integrating MH Asset computing resources for G&G applications, modelling and reservoir simulations, and of meeting the ever-increasing data size and computing requirements. The Centre is equipped with the latest hardware and software for geological modelling and reservoir simulation, along with a high-speed LAN. Its state-of-the-art equipment includes Dell PowerEdge ML630 Blade servers, a Dell PowerEdge R360 Rack Server, NetApp FAS8060 Network Attached Storage (NAS), a Dell ML6020 Tape Library and the latest NetVault backup software. Geological and Geophysical (G&G) applications run on the servers, and the data is kept on the NAS. Using the NetVault backup software, the NAS is backed up to LTO-6 magnetic tapes.
Operating System Installation
First, we were told about various operating systems and then guided through installing the Ubuntu 16.04 (LTS) operating system. We installed Ubuntu alongside Windows on one of the systems given to us.
Step 1: Prepare Windows Machine for Dual-Boot
1. The first thing you need to take care of is to create free space on the computer's hard disk in case the system is installed on a single partition. Log in to your Windows machine with an administrative account and right-click on the Start Menu -> Command Prompt (Admin) in order to enter the Windows command line.
Preparing Windows for Dual-Boot with Ubuntu 16.04
2. Once in the CLI, type diskmgmt.msc at the prompt and the Disk Management utility should open. From here, right-click on the C: partition and select Shrink Volume in order to resize the partition.
C:\Windows\system32>diskmgmt.msc
Shrink Volume to Resize Partition
3. On Shrink C:, enter the amount of space to shrink in MB (use at least 20000 MB, depending on the size of the C: partition) and hit Shrink to start resizing the partition, as illustrated below. Once the partition has been resized you will see new unallocated space on the hard drive. Leave it as it is and reboot the computer in order to proceed with the Ubuntu 16.04 installation.
Create Windows Partition for Ubuntu 16.04 Installation
Windows Partition for Dual Boot Ubuntu 16.04
Step 2: Install Ubuntu 16.04 with Windows Dual-Boot
4. Now it’s time to install Ubuntu 16.04. Go to the download link from the topic description and grab the Ubuntu Desktop 16.04 ISO image. Burn the image to a DVD or create a bootable USB stick using a utility such as Universal USB Installer (BIOS compatible) or Rufus (UEFI compatible). Place the USB stick or DVD in the appropriate drive, reboot the machine and instruct the BIOS/UEFI to boot from the DVD/USB by pressing a special function key (usually F12, F10 or F2, depending on the vendor specifications). Once the media boots, a new grub screen should appear on your monitor. From the menu select Install Ubuntu and hit Enter to continue.
Ubuntu 16.04 Install Boot Screen
5. After the boot media finishes loading into RAM you will end up with a completely functional Ubuntu system running in live mode. On the Launcher, click the second icon from the top, Install Ubuntu 16.04 LTS, and the installer utility will start. Choose the language in which you wish to perform the installation and click the Continue button to proceed.
Select Ubuntu 16.04 Installation Language
6. Next, leave both options from Preparing to Install Ubuntu unchecked and hit on Continue button again.
Preparing Ubuntu 16.04 Installation
7. Now it’s time to select an Installation Type. You can choose Install Ubuntu alongside Windows Boot Manager, an option that will automatically take care of all the partitioning steps. Use this option if you don’t require a personalized partition scheme. In case you want a custom partition layout, check the Something else option and hit the Continue button to proceed. The option Erase disk and install Ubuntu should be avoided on a dual-boot machine because it is potentially dangerous and will wipe out your disk.
Select Ubuntu 16.04 Installation Type
8. In this step we’ll create a custom partition layout for Ubuntu 16.04. This guide recommends that you create two partitions, one for root and the other for home account data, and no partition for swap (use a swap partition only if you have limited RAM resources or you use a fast SSD). To create the first partition, the root partition, select the free space (the space shrunk from Windows earlier) and hit the + icon below. In the partition settings use the following configuration and hit OK to apply the changes:
Size = at least 20000 MB
Type for the new partition = Primary
Location for the new partition = Beginning
Use as = EXT4 journaling file system
Mount point = /
Create Ubuntu 16.04 Partitions
Create Root Partition for Ubuntu 16.04
Create the home partition using the same steps as above. Use all the remaining free space for the home partition. The partition settings should look like this:
Size = all remaining free space
Type for the new partition = Primary
Location for the new partition = Beginning
Use as = EXT4 journaling file system
Mount point = /home
Create Home Partition for Ubuntu 16.04
9. When finished, hit the Install Now button in order to apply changes to disk and start the installation process. A pop-up window should appear to inform you about swap space. Ignore the alert by pressing on Continue button. Next a new pop-up window will ask you if you agree with committing changes to disk. Hit Continue to write changes to disk and the installation process will now start.
Confirm Partition Changes
Confirm Write Changes to Disk
10. On the next screen, set your machine's physical location by selecting a nearby city from the map. When done, hit Continue to move ahead.
Select Your City Location
11. Next, select your keyboard layout and click on Continue button.
Select Keyboard Layout
12. Pick a username and password for your administrative sudo account, enter a descriptive name for your computer and hit Continue to finalize the installation. These are all the settings required for customizing the Ubuntu 16.04 installation. From here on the installation process will run automatically until it reaches the end.
Create User Account for Ubuntu 16.04
Ubuntu 16.04 Installation Process
13. After the installation process reaches its end, hit the Restart Now button in order to complete the installation. The machine will reboot into the Grub menu, where for ten seconds you can choose which OS you wish to use: Ubuntu 16.04 or Microsoft Windows. Ubuntu is designated as the default OS to boot from, so just press the Enter key or wait for the 10-second timeout to expire.
Ubuntu 16.04 Installation Completed
Grub Menu Select Ubuntu or Windows to Boot
14. After Ubuntu finishes loading, log in with the credentials created during the installation process and enjoy it. Ubuntu 16.04 provides NTFS file system support automatically, so you can access the files on Windows partitions just by clicking on the Windows volume.
Ubuntu 16.04 Login
Access Windows Partitions from Ubuntu 16.04
That’s it! In case you need to switch back to Windows, just reboot the computer and select Windows from the Grub menu.
Linux Commands
Basic Commands
1. pwd — When you first open the terminal, you are in the home directory of your user. To know which directory you are in, you can use the “pwd” command. It gives the absolute path, which means the path that starts from the root. The root is the base of the Linux file system and is denoted by a forward slash ( / ). The user directory is usually something like "/home/username".
2. ls — Use the "ls" command to know what files are in the directory you are in. You can also see the hidden files by using the command “ls -a”.
3. cd — Use the "cd" command to go to a directory. For example, if you are in the home folder and you want to go to the Downloads folder, you can type in “cd Downloads”. Remember, this command is case sensitive, and you have to type in the name of the folder exactly as it is. There is one catch. Imagine you have a folder named “Raspberry Pi”. When you type in “cd Raspberry Pi”, the shell treats the second word as a separate argument, so you will get an error saying that the directory does not exist. Here you can escape the space with a backslash, that is, use “cd Raspberry\ Pi”. If you just type “cd” and press Enter, it takes you to the home directory. To go back from a folder to its parent folder, you can type “cd ..”. The two dots represent the parent directory.
4. mkdir & rmdir — Use the mkdir command when you need to create a folder or a directory. For example, if you want to make a directory called “DIY”, then you can type “mkdir DIY”. As mentioned before, if you want to create a directory named “DIY Hacking”, then you can type “mkdir DIY\ Hacking”. Use rmdir to delete a directory. But rmdir can only be used to delete an empty directory. To delete a directory containing files, use “rm -r”.
6. touch — The touch command is used to create a file. It can be anything, from an empty txt file to an empty zip file. For example, “touch new.txt”.
7. man & --help — To know more about a command and how to use it, use the man command. It shows the manual pages of the command. For example, “man cd” shows the manual pages of the cd command. Typing the command name followed by the --help argument shows the ways in which the command can be used (e.g., cd --help).
8. cp — Use the cp command to copy files through the command line. It takes two arguments: The first is the location of the file to be copied, the second is where to copy.
9. mv — Use the mv command to move files through the command line. We can also use the mv command to rename a file. For example, if we want to rename the file “text” to “new”, we can use “mv text new”. It takes the two arguments, just like the cp command.
10. locate — The locate command is used to locate a file in a Linux system, just like the search command in Windows. This command is useful when you don't know where a file is saved or the actual name of the file. Using the -i argument with the command makes it ignore case (it doesn't matter if the name is uppercase or lowercase). So, if you want a file whose name contains the word “hello”, typing “locate -i hello” gives the list of all the files in your Linux system whose names contain "hello". If you remember two words, you can separate them using an asterisk (*). For example, to locate a file whose name contains the words "hello" and "this", you can use the command “locate -i *hello*this”.
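Since the applications in this report were written in Python, it may help to see how the basic commands above map onto Python's standard library. The following is only a sketch written for Python 3; the file and folder names are taken from the examples above, and the scratch directory is an assumption made so that nothing in the real home folder is touched.

```python
import os
import shutil
import tempfile


def demo_basic_commands():
    """Run Python equivalents of pwd, cd, mkdir, touch, cp, mv, rmdir and ls."""
    start = os.getcwd()                          # pwd: absolute path of the current directory
    workdir = tempfile.mkdtemp()                 # scratch directory (assumption for this demo)
    try:
        os.chdir(workdir)                        # cd <workdir>
        os.mkdir("DIY Hacking")                  # mkdir; spaces need no escaping in Python
        open("new.txt", "w").close()             # touch new.txt
        shutil.copy("new.txt", "copy.txt")       # cp new.txt copy.txt
        shutil.move("copy.txt", "renamed.txt")   # mv copy.txt renamed.txt
        os.rmdir("DIY Hacking")                  # rmdir: works only on an empty directory
        return sorted(os.listdir("."))           # ls -a, without the "." and ".." entries
    finally:
        os.chdir(start)                          # cd back to the starting directory
        shutil.rmtree(workdir)                   # rm -r <workdir>


print(demo_basic_commands())
```

The function returns the directory listing instead of printing inside the `try` block so that the clean-up in `finally` always runs, even if one of the steps fails.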
Intermediate Commands
1. echo — The "echo" command prints its arguments and, combined with redirection, helps us move some data, usually text, into a file. For example, if you want to create a new text file or add to an existing one, you just need to type in “echo hello, my name is alok >> new.txt”. You do not need to escape the spaces with a backslash here, because everything before the two angle brackets (>>) is passed to echo and its output is appended to the file.
2. cat — Use the cat command to display the contents of a file. It is usually used to easily view programs.
3. nano, vi, jed — nano and vi are already installed text editors in the Linux command line. The nano command is a good text editor that denotes keywords with color and can recognize most languages. And vi is simpler than nano. You can create a new file or modify a file using this editor. For example, if you need to make a new file named "check.txt", you can create it by using the command “nano check.txt”. You can save your files after editing by using the sequence Ctrl+X, then Y (or N for no). In my experience, using nano for HTML editing doesn't seem as good, because of its color, so I recommend jed text editor. We will come to installing packages soon.
4. sudo — A widely used command in the Linux command line, sudo stands for "SuperUser Do". So, if you want any command to be run with administrative or root privileges, you can use the sudo command. For example, if you want to edit a file such as alsa-base.conf, which needs root permissions, you can use the command “sudo nano alsa-base.conf”. You can enter the root command line using the command “sudo bash”, then type in your user password. You can also use the command “su” to do this, but you need to set a root password before that. For that, you can use the command “sudo passwd” (not misspelled, it is passwd). Then type in the new root password.
5. df — Use the df command to see the available disk space in each of the partitions in your system. You can just type in df in the command line and you can see each mounted partition and their used/available space in % and in KBs. If you want it shown in megabytes, you can use the command “df -m”.
6. du — Use du to know the disk usage of a file in your system. If you want to know the disk usage for a particular folder or file in Linux, you can type in the command du and the name of the folder or file. For example, if you want to know the disk space used by the Documents folder in Linux, you can use the command “du Documents”. You can also use the command “ls -lah” to view the file sizes of all the files in a folder.
7. tar — Use tar to work with tarballs (or files compressed in a tarball archive) in the Linux command line. It has a long list of uses. It can be used to compress and uncompress different types of tar archives like .tar, .tar.gz, .tar.bz2, etc. It works on the basis of the arguments given to it. For example, “tar -cvf” creates a .tar archive, “tar -xvf” extracts a tar archive, and “tar -tvf” lists the contents of an archive.
8. zip, unzip — Use zip to compress files into a zip archive, and unzip to extract files from a zip archive.
9. uname — Use uname to show information about the system your Linux distro is running on. Using the command “uname -a” prints most of the information about the system, such as the kernel release, version and processor type.
10. apt-get — Use apt to work with packages in the Linux command line. Use apt-get to install packages. This requires root privileges, so use the sudo command with it. For example, if you want to install the text editor jed (as I mentioned earlier), you can type in the command “sudo apt-get install jed”. Any package can be installed like this. It is good to update your package lists each time you install a new package. You can do that by typing “sudo apt-get update”. You can upgrade the system by typing “sudo apt-get upgrade”. You can also upgrade the distro by typing “sudo apt-get dist-upgrade”. The command “apt-cache search” is used to search for a package. If you want to search for one, you can type in “apt-cache search jed” (this doesn't require root).
11. chmod — Use chmod to change the permissions granted to a file in Linux, for example to make it executable. Imagine you have a Python script named numbers.py on your computer. You'll need to run “python numbers.py” every time you need to run it. Instead, once the file is executable and starts with an interpreter line such as “#!/usr/bin/env python”, you only need to run “./numbers.py” in the terminal. To make a file executable, you can use the command “chmod +x numbers.py” in this case. You can use “chmod 755 numbers.py” to give the owner read, write and execute permissions and everyone else read and execute permissions, or “sudo chmod +x numbers.py” if the file is owned by root.
12. hostname — Use hostname to know the name of your machine on the network. Just typing “hostname” prints the hostname. Typing in “hostname -I” gives you your IP address in your network.
13. ping — Use ping to check your connection to a server. Wikipedia says, "Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network". Simply, when you type in, for example, “ping google.com”, it checks if it can connect to the server and come back. It measures this round-trip time and gives you the details about it. The use of this command for simple users like us is to check your internet connection. If it pings the Google server (in this case), you can confirm that your internet connection is active!
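Several of the intermediate commands above also have close counterparts in Python's standard library, which is convenient when the same checks have to run inside a script. The sketch below is written for Python 3 and is only an illustration: the helper names `directory_size`, `make_executable` and `archive_and_list` are hypothetical, and the demo works inside a temporary directory rather than on real files.

```python
import os
import socket
import stat
import tarfile
import tempfile


def directory_size(path):
    """Sum the sizes in bytes of all files under path (similar in spirit to du)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def make_executable(path):
    """Add the execute bit for user, group and others (like chmod +x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def archive_and_list(src_dir, archive_path):
    """Create a .tar.gz archive (tar -czf) and return its member names (tar -tf)."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname="project")
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "project")
    os.mkdir(project)
    script = os.path.join(project, "numbers.py")
    with open(script, "wb") as f:               # binary mode keeps the byte count exact
        f.write(b"print(42)\n")

    make_executable(script)
    is_exec = os.access(script, os.X_OK)        # True once the execute bit is set
    size = directory_size(project)
    names = archive_and_list(project, os.path.join(tmp, "project.tar.gz"))

print(socket.gethostname())                     # same name that the hostname command prints
print(size, names, is_exec)
```

Working through `os.stat` and a bitwise OR, rather than setting a fixed mode such as 0o755, keeps whatever read and write permissions the file already had.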
1. Download the Python 2.7.10 source code from python.org using the wget command given below.
$ cd ~/Downloads
$ wget https://www.python.org/ftp/python/2.7.10/Python-2.7.10.tgz
2. Extract the downloaded package.
$ tar -zxvf Python-2.7.10.tgz
3. Install the build essentials needed to build the Python source code.
$ sudo apt-get install build-essential checkinstall
$ sudo apt-get install libreadline-gplv2-dev libncursesw5-dev libgdbm-dev libc6-dev libbz2-dev libsqlite3-dev tk-dev libssl-dev
4. Go to the Python-2.7.10 folder, then configure and build the Python source code with the make altinstall command as given below.
$ cd ~/Downloads/Python-2.7.10/
$ sudo ./configure
$ sudo make altinstall
The build process takes some time. Once this step completes successfully, Python 2.7.10 is installed.
5. To verify the Python installation location, use the whereis command.
$ whereis python2.7
6. To verify the Python version, run the command given below.
$ python2.7 --version
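The version check in the last step can also be made from inside the interpreter, which is handy when several Python builds are installed side by side. This is a small sketch only; `describe_interpreter` is a hypothetical helper name, and the code uses nothing beyond the `sys` module, so it should behave the same on Python 2.7 and Python 3.

```python
import sys


def describe_interpreter():
    """Return the running interpreter's version string and executable path."""
    version = "%d.%d.%d" % sys.version_info[:3]   # major.minor.micro
    return version, sys.executable


version, path = describe_interpreter()
print("Python %s at %s" % (version, path))
```

Running this file with `python2.7` after the build above should report the version of the interpreter that was just installed.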
Python
I used Python version 2.7.11. Python is an easy-to-learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms. The Python interpreter is easily extended with new functions and data types implemented in C or C++ (or other languages callable from C). Python is also suitable as an extension language for customizable applications.
Some basic uses of Python in my project
1. Reading and Writing Files
open() returns a file object, and is most commonly used with two arguments: open(filename, mode). In what follows, assume that a file object called f has already been created. To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string; size is an optional numeric argument. f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous: if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline. f.write(string) writes the contents of string to the file, returning None. f.tell() returns an integer giving the file object’s current position in the file, measured in bytes from the beginning of the file.
2. Defining Functions
The keyword def introduces a function definition. It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and must be indented.
3. for Statements
The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or giving the user the ability to define both the iteration step and halting condition (as in C), Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence.
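The three features described above fit together naturally in a small example. The sketch below is written for Python 3 and is not taken from the project code: the function names `write_readings` and `read_readings`, the file name and the sample values are all hypothetical.

```python
import os
import tempfile


def write_readings(path, readings):
    """Write one reading per line and return the number of lines written."""
    with open(path, "w") as f:
        for value in readings:              # for iterates directly over the list items
            f.write("%s\n" % value)
    return len(readings)


def read_readings(path):
    """Read the file line by line until readline() returns an empty string."""
    values = []
    with open(path, "r") as f:
        while True:
            line = f.readline()
            if line == "":                  # an empty string means end of file
                break
            values.append(float(line.strip()))
    return values


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "readings.txt")
    count = write_readings(path, [1.5, 2.0, 3.25])
    values = read_readings(path)

print(count, values)
```

In everyday code the `while` loop would usually be replaced by `for line in f:`, which iterates over the lines of the file directly; `readline()` is used here only to show the end-of-file behaviour described above.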
Python's Data Science Libraries
$ sudo apt-get install python-pip
$ sudo apt-get update
$ pip install pandas
$ pip install matplotlib
$ pip install xlsxwriter
$ pip install mpld3
What problem does pandas solve? Python has long been great for data munging and preparation, but less so for data analysis and modeling. pandas helps fill this gap, enabling you to carry out your entire data analysis workflow in Python without having to switch to a more domain specific language like R. Combined with the excellent IPython toolkit and other libraries, the environment for doing data analysis in Python excels in performance, productivity, and the ability to collaborate. pandas does not implement significant modeling functionality outside of linear and panel regression; for this, look to statsmodels and scikit-learn. More work is still needed to make Python a first class statistical modeling environment, but we are well on our way toward that goal.
Library Highlights
• A fast and efficient DataFrame object for data manipulation with integrated indexing;
• Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format;
• Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form;
• Flexible reshaping and pivoting of data sets;
• Intelligent label-based slicing, fancy indexing, and subsetting of large data sets;
• Columns can be inserted and deleted from data structures for size mutability;
• Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets;
• High performance merging and joining of data sets;
• Hierarchical axis indexing provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure;
• Time series functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data;
• Highly optimized for performance, with critical code paths written in Cython or C.
Python with pandas is in use in a wide variety of academic and commercial domains, including Finance, Neuroscience, Economics, Statistics, Advertising, Web Analytics, and more.
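A few of the highlights listed above (the DataFrame object, handling of missing data, label-based selection and split-apply-combine grouping) can be seen in a few lines. This is only a sketch: it assumes pandas is installed, is written for Python 3, and uses made-up column names and readings rather than data from the project.

```python
import pandas as pd

# Hypothetical readings: two lines of three geophones each, with one value missing.
frame = pd.DataFrame({
    "line": ["A", "A", "A", "B", "B", "B"],
    "geophone": [1, 2, 3, 1, 2, 3],
    "resistance_ohm": [375.0, 380.0, None, 390.0, 410.0, 400.0],
})

# Split-apply-combine: group the rows by line, then average each group.
# The missing reading is stored as NaN and skipped by mean().
mean_by_line = frame.groupby("line")["resistance_ohm"].mean()

# Label-based selection: rows whose reading is above 385 ohm, two columns only.
high = frame.loc[frame["resistance_ohm"] > 385.0, ["line", "geophone"]]

print(mean_by_line)
print(high)
```

The comparison against a missing value evaluates to False, so the geophone without a reading is simply left out of the `high` selection rather than raising an error.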
GEOPHONES ANALYSIS
INTRODUCTION:
A geophone is a ground motion transducer used by geophysicists and seismologists to convert ground movement into voltage. Any deviation of this measured voltage from the baseline is regarded as the seismic response, which is used for analyzing the earth’s structure. Resonance frequency is the key parameter of a geophone, and it has to be low for the measurement of low-frequency signals. On the other hand, a geophone must also exhibit high bandwidth to measure high-frequency signals. However, most of the currently available geophones include mechanical springs that decrease the performance of the device. Feeding back the output of the geophone can vary its sensitivity with respect to frequency, so that low-frequency signals are amplified. The resolution of a geophone can be enhanced by controlling the position of the proof mass.
Working Principle
A typical geophone consists of a mass suspended by means of mechanical springs. When a velocity is applied at frequencies lower than the resonance frequency, the geophone housing and the suspended mass move together. At frequencies greater than the resonance frequency, the mass remains stationary. The moving mass is either a magnet or a coil. The response of a coil/magnet geophone is proportional to the ground velocity.
Applications
Geophones are used in several industrial applications for vibration isolation and absolute velocity sensing, where a high level of accuracy and precision is required. Geophone technology is employed to measure absolute velocity in lithographic and high-level inspection applications, in order to determine payload disturbances caused by moving parts and other external disturbances. Geophones are also used to position and control complex lens systems. Detection of leakage in oil and gas fields and earthquake prediction are other major applications of geophones.
OBJECTIVES:
The aim of the project is to visualize the resistance of the geophones so that the status of each geophone, i.e. good or bad, can be determined.
METHODOLOGY:
The input consists of CSV files containing the resistance of each geophone along with other parameters such as tilt. Each file represents the data taken at a particular time on a single day. The files are first cleaned by removing the unwanted columns and strings. The processed files are then loaded with Python’s data science library pandas and plotted using Plotly.
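The cleaning and classification steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names, the non-numeric marker "open", the sample values and the acceptance band of 350 to 400 ohms are all hypothetical and are not taken from the actual project files.

```python
import io

import pandas as pd

# Hypothetical file contents: column names, the "open" marker and all values
# are assumptions made for this sketch, not the project's real data.
RAW_CSV = """station,resistance,tilt,comment
101,375.2,1.5,ok
102,open,0.7,check cable
103,512.9,2.1,ok
"""

# Assumed acceptance band for resistance in ohms (illustrative only).
LOW, HIGH = 350.0, 400.0


def load_and_classify(source):
    """Read one CSV, drop unwanted columns and strings, label each geophone."""
    df = pd.read_csv(source)
    df = df.drop(columns=["comment"])  # remove an unwanted column
    # Turn non-numeric strings such as "open" into NaN, then drop those rows.
    df["resistance"] = pd.to_numeric(df["resistance"], errors="coerce")
    df = df.dropna(subset=["resistance"]).reset_index(drop=True)
    # Mark a geophone as good when its resistance lies inside the assumed band.
    in_band = df["resistance"].between(LOW, HIGH)
    df["status"] = in_band.map({True: "good", False: "bad"})
    return df


result = load_and_classify(io.StringIO(RAW_CSV))
print(result)
```

A frame of this shape, with one row per geophone and a status column, is a convenient input for a plotting step, since the status can be used to colour the points.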
TECHNIQUES USED ARE:
Python’s Data Science Library
Pandas
Matplotlib
Plotly
Python’s web framework Django.
PostgreSQL
MODULES:
Admin: Manages the authorizations for group membership and for users
Users: Can upload, view and download the files
Groups: Authorize the users in a particular group
Using Python's csv module to parse the data
We create empty lists to contain the latitudes and longitudes. Then we use the with statement to ensure that the file closes properly once it has been read, even if there are errors in processing the file. With the data file open, we initialize a csv reader object. The next() function skips over the header row. Then we loop through each row in the data file and pull out the information we want.
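The steps described above can be sketched as follows. The function name, the column positions and the sample values are assumptions made for this illustration; a real data file may store the coordinates in other columns.

```python
import csv
import os
import tempfile


def read_coordinates(path, lat_col=0, lon_col=1):
    """Return (latitudes, longitudes) parsed from a CSV file with a header row.

    The column positions are assumptions for this sketch.
    """
    latitudes, longitudes = [], []
    # The with statement closes the file even if parsing raises an error.
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            latitudes.append(float(row[lat_col]))
            longitudes.append(float(row[lon_col]))
    return latitudes, longitudes


# Usage with a small temporary file holding made-up sample values.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write("latitude,longitude\n19.07,72.87\n26.91,75.78\n")
    sample_path = tmp.name

lats, lons = read_coordinates(sample_path)
os.remove(sample_path)
print(lats, lons)
```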
USING PYPLOT
matplotlib.pyplot is a collection of command-style functions that make matplotlib work like MATLAB. Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc. In matplotlib.pyplot various states are preserved across function calls, so that it keeps track of things like the current figure and plotting area, and the plotting functions are directed to the current axes (please note that “axes” here and in most places in the documentation refers to the axes part of a figure and not the strict mathematical term for more than one axis).
import matplotlib.pyplot as plt
# With a single list, the values are used as y and the x values default to 0, 1, 2, 3.
plt.plot([1, 2, 3, 4])
plt.ylabel('some numbers')
plt.show()
LEARNING DJANGO
LEARNING THE USE OF PostgreSQL DATABASE
PostgreSQL is a powerful, open source object-relational database system. PostgreSQL, originally called Postgres, was created at the University of California, Berkeley (UCB) by a computer science professor named Michael Stonebraker. PostgreSQL supports four standard procedural languages, which allow users to write their own code in any of these languages and have it executed by the PostgreSQL database server. These procedural languages are PL/pgSQL, PL/Tcl, PL/Perl and PL/Python. Besides these, other non-standard procedural languages such as PL/PHP, PL/V8, PL/Ruby and PL/Java are also supported.
PostgreSQL can be integrated with Python using the psycopg2 module. Psycopg2 is a PostgreSQL database adapter for the Python programming language. Psycopg2 was written with the aim of being very small and fast.
$ yum install python-psycopg2
The following Python code shows how to connect to an existing database. The database must already exist; if it does not, the connection attempt raises an error. On success, a connection object is returned.
#!/usr/bin/python
import psycopg2
# The connection parameters below are example values.
conn = psycopg2.connect(database="testdb", user="postgres", password="pass123", host="127.0.0.1", port="5432")
print("Opened database successfully")
Create a Table
The following Python program creates a table in the previously created database:
#!/usr/bin/python
import psycopg2
conn = psycopg2.connect(database="testdb", user="postgres", password="pass123", host="127.0.0.1", port="5432")
print("Opened database successfully")
# A cursor is used to send SQL statements over the connection.
cur = conn.cursor()
cur.execute('''CREATE TABLE COMPANY
    (ID INT PRIMARY KEY NOT NULL,
    NAME TEXT NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR(50),
    SALARY REAL);''')
print("Table created successfully")
# The table only becomes permanent once the transaction is committed.
conn.commit()
conn.close()
Creating a project
If this is your first time using Django, you’ll have to take care of some initial setup. Namely, you’ll need to auto-generate some code that establishes a Django project – a collection of settings for an instance of Django, including database configuration, Django-specific options and application-specific settings. From the command line, cd into a directory where you’d like to store your code, then run the following command:
$ django-admin startproject mysite
This will create a mysite directory in your current directory. Let’s look at what startproject created:
mysite/
    manage.py
    mysite/
        __init__.py
        settings.py
        urls.py
        wsgi.py
These files are:
The outer mysite/ root directory is just a container for your project. Its name doesn’t matter to Django; you can rename it to anything you like.
manage.py: A command-line utility that lets you interact with this Django project in various ways. You can read all the details about manage.py in django-admin and manage.py.
The inner mysite/ directory is the actual Python package for your project. Its name is the Python package name you’ll need to use to import anything inside it (e.g. mysite.urls).
mysite/__init__.py: An empty file that tells Python that this directory should be considered a Python package.
mysite/settings.py: Settings/configuration for this Django project. Django settings will tell you all about how settings work.
mysite/urls.py: The URL declarations for this Django project; a “table of contents” of your Django-powered site.
mysite/wsgi.py: An entry-point for WSGI-compatible web servers to serve your project.
The development server
To start the development server, run the following command from the outer mysite/ directory:
$ python manage.py runserver
This starts the Django development server, a lightweight Web server written purely in Python.
LEARNING OTHER THINGS APART FROM MY PROJECT
LEARNING GITHUB
GitHub is a web-based Git repository hosting service. It offers all of the distributed revision control and source code management (SCM) functionality of Git as well as adding its own features. Unlike Git, which is strictly a command-line tool, GitHub provides a Web-based graphical interface and desktop as well as mobile integration. It also provides access control and several collaboration features such as bug tracking, feature requests, task management, and wikis for every project.
Now what is Git?
Git is a version control system that is used for software development and other version control tasks. As a distributed revision control system it is aimed at speed, data integrity, and support for distributed, non-linear workflows. Git was created by Linus Torvalds in 2005 for development of the Linux kernel, with other kernel developers contributing to its initial development.
Again what is a version control system?
A component of software configuration management, version control, also known as revision control or source control,[1] is the management of changes to documents, computer programs, large web sites, and other collections of information. The need for a logical way to organize and control revisions has existed for almost as long as writing has existed, but revision control became much more important, and complicated, when the era of computing began. The numbering of book editions and of specification revisions are examples that date back to the print-only era. Today, the most capable (as well as complex) revision control systems are those used in software development, where a team of people may change the same files.
Version control systems (VCS) most commonly run as stand-alone applications, but revision control is also embedded in various types of software such as word processors and spreadsheets, e.g., Google Docs and Sheets[2] and in various content management systems, e.g., Wikipedia's Page history. Revision control allows for the ability to revert a document to a previous revision, which is critical for allowing editors to track each other's edits, correct mistakes, and defend against vandalism and spamming.
AWK
AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems. The AWK language is a data-driven scripting language consisting of a set of actions to be taken against streams of textual data (either run directly on files or used as part of a pipeline) for purposes of extracting or transforming text, such as producing formatted reports. The language extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions. While AWK has a limited intended application domain and was especially designed to support one-liner programs, the language is Turing-complete, and even the early Bell Labs users of AWK often wrote well-structured large AWK programs. AWK was created at Bell Labs in the 1970s, and its name is derived from the surnames of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
AWK is a language for processing text files. A file is treated as a sequence of records, and by default each line is a record. Each line is broken up into a sequence of fields, so we can think of the first word in a line as the first field, the second word as the second field, and so on. An AWK program is a sequence of pattern-action statements. AWK reads the input a line at a time. A line is scanned for each pattern in the program, and for each pattern that matches, the associated action is executed.
An AWK program is a series of pattern-action pairs, written as:
condition { action }
where condition is typically an expression and action is a series of commands. The input is split into records, where by default records are separated by newline characters so that the input is split into lines. The program tests each record against each of the conditions in turn, and executes the action for each expression that is true. Either the condition or the action may be omitted. The condition defaults to matching every record. The default action is to print the record.
AWK commands
AWK commands are the statements that are substituted for action in the examples above. AWK commands can include function calls, variable assignments, calculations, or any combination thereof. AWK contains built-in support for many functions; many more are provided by the various flavors of AWK. Also, some flavors support the inclusion of dynamically linked libraries, which can also provide more functions.
The print command
The print command is used to output text. The output text is always terminated with a predefined string called the output record separator (ORS) whose default value is a newline. The simplest form of this command is:
print
This displays the contents of the current record. In AWK, records are broken down into fields, and these can be displayed separately:
print $1
Displays the first field of the current record.
print $1, $3
Displays the first and third fields of the current record, separated by a predefined string called the output field separator (OFS) whose default value is a single space character.
Sample applications
BEGIN { print "Hello, world!" }
Print lines longer than 80 characters
length($0) > 80
Count words in the input and print the number of lines, words, and characters
{
    w += NF
    c += length + 1
}
END { print NR, w, c }
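For readers more familiar with Python, the record and field model behind the word-count sample can be mirrored roughly as below. This is only an approximation written for illustration: the function name is hypothetical, and Python's whitespace splitting is assumed to be close enough to AWK's default field splitting for plain text input.

```python
def awk_like_count(text):
    """Mimic the AWK sample: { w += NF; c += length + 1 } END { print NR, w, c }."""
    nr = words = chars = 0
    for record in text.splitlines():  # by default each line is one record
        nr += 1                       # NR: number of records read so far
        words += len(record.split())  # NF: number of whitespace-separated fields
        chars += len(record) + 1      # length of the record plus its newline
    return nr, words, chars


sample = "alpha beta\ngamma\n"
print(*awk_like_count(sample))
```

The three returned values correspond to the line, word and character counts that the AWK program prints in its END block.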