{"cells":[{"cell_type":"markdown","id":"2c77954e","metadata":{"id":"2c77954e"},"source":["## How to read data in Python\n","\n","Team member names:\n","- Student 1\n","- Student 2\n","\n","\n","Next, you will read and explore the Resultados Pruebas Saber Colegios dataset for Girón, in the department of Santander.\n","\n","Before continuing, read the code in Section 1 and predict what it does. Write your predictions:\n","\n","- pd.read_csv(): Your answer goes here\n","- .head(10):\n","- .tail(10):\n","- .info():\n","- .shape:\n","\n"]},{"cell_type":"markdown","id":"fa6d3407","metadata":{"id":"fa6d3407"},"source":["What kinds of questions could you ask about the Pruebas Saber data? Write at least one question.\n","\n","Question:\n"]},{"cell_type":"markdown","id":"83200948","metadata":{"id":"83200948"},"source":["### Section 1\n","It is time to run the code cells. Observe the output and replace each comment with a description of what the function does."]},{"cell_type":"code","execution_count":null,"id":"2431ae80","metadata":{"id":"2431ae80"},"outputs":[],"source":["# Import the pandas package and give it a short alias: pd\n","import pandas as pd"]},{"cell_type":"code","execution_count":null,"id":"9be350fe","metadata":{"id":"9be350fe"},"outputs":[],"source":["# Write your comment\n","saber = pd.read_csv(\"pruebas_saber.csv\")"]},{"cell_type":"code","execution_count":null,"id":"9fadf983","metadata":{"id":"9fadf983"},"outputs":[],"source":["# Write your comment\n","print(type(saber))"]},{"cell_type":"code","execution_count":null,"id":"89b1beea","metadata":{"id":"89b1beea"},"outputs":[],"source":["# Write your comment\n","saber.head(10)\n","\n","# Change the number inside the parentheses. What happens?\n","# Write your answer"]},{"cell_type":"code","execution_count":null,"id":"3faa381e","metadata":{"id":"3faa381e"},"outputs":[],"source":["# Write your comment\n","saber.tail(10)"]},{"cell_type":"code","execution_count":null,"id":"af5a35ec","metadata":{"id":"af5a35ec"},"outputs":[],"source":["# Write your comment\n","saber.columns"]},{"cell_type":"markdown","id":"e0c23634","metadata":{"id":"e0c23634"},"source":["### Section 2\n","Next, you will learn two new functions that let you **answer questions about your data**.\n","\n","Answer each of the following questions and write down the function you used to answer it. 
For example:\n","\n","\n","- How many columns does the dataframe have?\n","  - Answer: The dataframe has 6 columns (attribute .shape)\n","\n","- How many rows does the dataframe have?\n","  - Answer:\n","\n","- What was the average score obtained by the schools of Girón on the Pruebas Saber?\n","  - Answer:\n","\n","- What was the maximum score?\n","  - Answer:\n","\n","- How many schools have a single-shift school day (jornada única)?\n","  - Answer:"]},{"cell_type":"code","execution_count":null,"id":"8ba2c655","metadata":{"id":"8ba2c655"},"outputs":[],"source":["# What kind of information does this attribute give you?\n","saber.shape"]},{"cell_type":"code","execution_count":null,"id":"8cf671a4","metadata":{"id":"8cf671a4"},"outputs":[],"source":["# What kind of information does this function give you?\n","saber.describe()"]},{"cell_type":"code","execution_count":null,"id":"a212d22f","metadata":{"id":"a212d22f"},"outputs":[],"source":["# What do you think the following code does?\n","jornadas = saber[\"JORNADA\"]\n","print(jornadas)"]},{"cell_type":"code","execution_count":null,"id":"37b1e610","metadata":{"id":"37b1e610"},"outputs":[],"source":["# What kind of information does this function give you?\n","jornadas.value_counts()"]},{"cell_type":"markdown","id":"f524881f","metadata":{"id":"f524881f"},"source":["### Section 3: Your turn\n","Follow the instructions in the guide to find and download your own dataset.\n","\n","1. Import pandas and read your CSV file.\n","2. Explore your dataframe using ```head()```, ```shape```, and ```describe()```.\n","3. 
Save one of the columns of your dataframe in a new variable and print it."]}],"metadata":{"kernelspec":{"display_name":"Python 3 (ipykernel)","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.12"},"colab":{"provenance":[]}},"nbformat":4,"nbformat_minor":5}