Copyright and Your Dissertation

Hybrid: Doe 223 or Zoom

This workshop will provide you with practical guidance for navigating copyright questions and other legal considerations for your dissertation or thesis. Whether you’re just starting to write or you’re getting ready to file, you can use our tips and workflow to figure out what you can use, what rights you have as an author, and what it means to share your dissertation online.

Python Text Analysis Fundamentals: Part 2

Online via Zoom

This two-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.
Part 1: Preprocessing Text. How do we standardize and clean text documents? Text data is noisy, and we often need to develop a pipeline to standardize the data and better facilitate computational modeling. In the first part of this workshop, we walk through possible steps in this pipeline, using tools from basic Python, NLTK, and spaCy to preprocess and tokenize our data.
Part 2: Bag-of-words Representations. How do we convert text into a representation that we can operate on computationally? This requires developing a numerical representation of the text. In this part of the workshop, we study one of the foundational numerical representations of text data: the bag-of-words model. This model relies heavily on word frequencies to characterize text corpora. We build bag-of-words models and their variations (e.g., TF-IDF), and use these representations to perform classification on text.
To continue with text analysis, sign up for Topic Modeling or Word Embeddings.
Part 3: Topic Modeling. How do we identify topics within a corpus of documents? In this part, we study unsupervised learning of text data. Specifically, we use topic models such as Latent Dirichlet Allocation and Non-negative Matrix Factorization to construct “topics” in text from the statistical regularities in the data.
Part 4: Word Embeddings. How can we use neural networks to create meaningful representations of words? The bag-of-words model is limited in its ability to characterize text because it does not use word context. In this part, we study word embeddings, which were among the first attempts to use neural networks to develop numerical representations of text that incorporate context. We learn how to use the gensim package to construct and explore word embeddings of text.
The first two parts are taught as a joint series. Parts 3 and 4 can be attended "a la carte"; however, prior knowledge of Parts 1 and 2 is assumed.
Prerequisites: D-Lab’s Python Fundamentals introductory series or equivalent knowledge.
Workshop Materials: https://github.com/dlab-berkeley/Python-Text-Analysis
Software Requirements: Installation Instructions for Python Anaconda
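
For a rough sense of what a bag-of-words representation looks like in practice, here is a minimal sketch in plain Python. It is not taken from the workshop materials, which may use different tools and data. The helper names (tokenize, bag_of_words, tf_idf), the two sample sentences, and the smoothed IDF formula are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count_vectors) for a list of raw documents."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    # The vocabulary is every distinct token, in alphabetical order.
    vocab = sorted(set().union(*counts))
    # Each document becomes one row of word counts over the vocabulary.
    vectors = [[c[word] for word in vocab] for c in counts]
    return vocab, vectors


def tf_idf(vectors):
    """Reweight raw counts by a smoothed inverse document frequency.

    Assumed formula: idf = ln((1 + n_docs) / (1 + doc_freq)) + 1.
    """
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    doc_freq = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    return [[count * w for count, w in zip(v, idf)] for v in vectors]


# Hypothetical two-document corpus, for illustration only.
docs = ["The cat sat on the mat.", "The dog sat on the log."]
vocab, vectors = bag_of_words(docs)
weighted = tf_idf(vectors)

print(vocab)
print(vectors)
```

Words that appear in every document (such as "the") keep their raw count under this weighting, while words unique to one document are weighted up, which is the intuition behind TF-IDF.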

First Generation Alumni in Helping Professions

Blue & Gold Room 2440 Bancroft Way, Berkeley

Career Connections events are co-sponsored by the Cal Alumni Association and give you the opportunity to meet with alumni and professionals representing a variety of roles within a career field. Gain insider knowledge about hiring practices and create meaningful connections that could lead to a future internship or job! Come explore the variety of career and internship possibilities found in the world of helping professions with first-generation (first in family to attend college) alumni working in various roles. Helping professions are occupations that provide health and education services to individuals and groups, including the fields of psychology, psychiatry, counseling, medicine, nursing, social work, physical and occupational therapy, teaching, and education. Students of all majors are welcome! Light refreshments will be provided. Featured speakers will be posted closer to the event.