Comparative Analysis of Alpha Diversity in Kelp Forests: MPA vs. Non-MPA

Data Description

Data source:

The kelp forest data come from the Partnership for Interdisciplinary Studies of Coastal Oceans (PISCO) (https://www.piscoweb.org/), doi: https://doi.org/10.6085/AA/pisco_subtidal.161.2

Describe your data, including variables and data types.

1_PISCO_Fish_bysite.csv includes:
categorical variables: SITE, MPAGroup, site_designation, site_status, MLPA_REGN, YEAR_MPA, year.
numeric variables: LATITUDE, LONGITUDE, fishtrans, fish_XXXX (average biomass across transects per site per year for different fish species)

2_PISCO_SwathInvert_bysite.csv includes:
categorical variables: SITE, MPAGroup, site_designation, site_status, MLPA_REGN, YEAR_MPA, year.
numeric variables: LATITUDE, LONGITUDE, fishtrans, swath_XXXX (average biomass across transects per site per year for different invertebrate species)

Research question:

How does species biodiversity differ between Marine Protected Areas (MPAs) and non-MPAs (reference sites)?

Data Cleaning

I joined the fish and invertebrate data, kept only the variables I needed, restricted the data to California kelp forests, and dropped NAs.
I then added columns that calculate alpha diversity using different metrics and reshaped the data so it is easier to use in the visualizations that follow.
I also grouped the cleaned data and summarized it by year using only the number of species.
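
As a rough illustration of the cleaning steps above, here is a minimal Python sketch using only the standard library. It is not the code used for this analysis. The two inline CSV snippets, the species column names (fish_AAAA, fish_BBBB, swath_CCCC), the site names, the site_status values ("mpa", "reference"), and the "NA" missing-value marker are all made up for illustration. The California filter is left out because this write-up does not say which column or values define it.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented stand-ins for the two input files (hypothetical values and species codes).
FISH_CSV = """SITE,year,site_status,fish_AAAA,fish_BBBB
SiteA,2015,mpa,1.2,0.0
SiteB,2015,reference,0.4,NA
SiteC,2015,reference,0.3,0.1
"""
INVERT_CSV = """SITE,year,site_status,swath_CCCC
SiteA,2015,mpa,3.0
SiteB,2015,reference,2.0
SiteC,2015,reference,0.0
"""

KEYS = ("SITE", "year", "site_status")   # columns used to match rows
SPECIES_PREFIXES = ("fish_", "swath_")   # per-species biomass columns


def read_rows(text):
    """Parse CSV text into a list of dicts (one per site-year)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_and_clean(fish_rows, invert_rows):
    """Inner-join on KEYS, keep only species columns, drop rows with missing values."""
    inverts = {tuple(r[k] for k in KEYS): r for r in invert_rows}
    cleaned = []
    for fish in fish_rows:
        key = tuple(fish[k] for k in KEYS)
        if key not in inverts:
            continue  # no matching invertebrate record for this site-year
        merged = {**fish, **inverts[key]}
        species = {c: v for c, v in merged.items() if c.startswith(SPECIES_PREFIXES)}
        if any(v in ("", "NA", None) for v in species.values()):
            continue  # drop rows containing NAs
        row = dict(zip(KEYS, key))
        row.update({c: float(v) for c, v in species.items()})
        cleaned.append(row)
    return cleaned


def species_count(row):
    """Number of species with biomass greater than zero in one site-year."""
    return sum(1 for c, v in row.items() if c.startswith(SPECIES_PREFIXES) and v > 0)


def yearly_species_count(rows):
    """Mean species count per (year, site_status) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["year"], row["site_status"])].append(species_count(row))
    return {key: mean(counts) for key, counts in groups.items()}


clean = join_and_clean(read_rows(FISH_CSV), read_rows(INVERT_CSV))
summary = yearly_species_count(clean)
```

In this toy input, SiteB is dropped because one of its species values is NA, and the remaining rows are averaged per year and site status.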

Data Visualization

What do you want your final visualizations to look like?

Boxplots for MPAs and non-MPAs, measuring species diversity using different metrics.

What do you want to highlight on your final visualizations in order to answer your research questions? How do you plan to do that?

I want to highlight whether the observed differences are significant. I plan to include asterisks that represent the p-values.
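
One way to turn a p-value into the asterisk notation is sketched below in Python. The cutoffs 0.05 and 0.01 follow the Figure 1 caption. Treating them as inclusive, and the function name significance_label, are assumptions made for this example.

```python
def significance_label(p_value):
    """Map a p-value to the label shown above a pair of boxplots.

    Cutoffs follow the Figure 1 caption; inclusive boundaries are an assumption.
    """
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.03, 0.004)]
```

The resulting label would then be drawn as a text annotation above the two boxes being compared.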

What is missing from your data or would need to change in your data to create these visualizations?

The data are in two separate datasets. They include the average biomass across transects per site per year for each species, but not species diversity calculated using different metrics for each site per year.
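
The missing per-site, per-year metrics could be computed along the lines of the Python sketch below. It assumes the indices are calculated from biomass proportions with the natural logarithm, which this write-up does not state. The species codes and values in the example are invented.

```python
import math


def alpha_diversity(biomass):
    """Compute four alpha diversity metrics for one site in one year.

    biomass: dict mapping species code -> mean biomass (hypothetical input shape).
    """
    present = [b for b in biomass.values() if b > 0]
    total = sum(present)
    if total == 0:
        # No species recorded: the indices are undefined or trivially zero.
        return {"total_biomass": 0.0, "observed": 0,
                "shannon": 0.0, "inverse_simpson": float("nan")}
    props = [b / total for b in present]               # relative abundance p_i
    shannon = -sum(p * math.log(p) for p in props)     # H' = -sum(p_i * ln p_i)
    inverse_simpson = 1.0 / sum(p * p for p in props)  # 1 / sum(p_i^2)
    return {"total_biomass": total, "observed": len(present),
            "shannon": shannon, "inverse_simpson": inverse_simpson}


# Two equally abundant species and one absent species (made-up values).
example = alpha_diversity({"fish_AAAA": 2.0, "fish_BBBB": 2.0, "swath_CCCC": 0.0})
```

With two equally abundant species, the Shannon index equals ln 2 and the inverse Simpson index equals 2, which makes a convenient sanity check.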

Boxplots comparing alpha diversity in MPAs and non-MPAs (reference) using (i) total biomass, (ii) inverse Simpson diversity index, (iii) observed species count, and (iv) Shannon diversity index. MPAs have higher diversity across all metrics.

Figure 1. Alpha diversity of invertebrate and fish species in Marine Protected Areas (MPAs) and non-MPAs. Alpha diversity was measured using (i) total biomass, (ii) inverse Simpson diversity index, (iii) observed species count, and (iv) Shannon diversity index. Statistical significance was determined using the Wilcoxon signed rank test (ns = not significant, * = p ≤ 0.05, ** = p ≤ 0.01).

Alpha diversity of invertebrate and fish species differs significantly between Marine Protected Areas (MPAs) and non-MPAs across all metrics. MPAs have higher species diversity than non-MPAs.
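
For readers unfamiliar with the test named in the Figure 1 caption, here is a minimal standard-library Python sketch of a two-sided Wilcoxon signed rank test. It uses the normal approximation without tie or continuity corrections, so it is only a rough guide for small samples. The test needs paired observations. How MPA and reference values would be paired is an assumption here, and the numbers below are invented rather than taken from the PISCO data.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed rank test (normal approximation).

    x, y: equal-length sequences of paired values. Returns (W_plus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, p_value


# Invented paired species counts (MPA value, matched reference value).
mpa = [12, 15, 11, 14, 16, 13, 17, 15]
reference = [10, 12, 12, 11, 12, 8, 10, 7]
w_plus, p_value = wilcoxon_signed_rank(mpa, reference)
```

For these invented values the p-value falls between 0.01 and 0.05, which would be drawn as a single asterisk.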

Line graph comparing species count across sites in each year in MPAs and non-MPAs (reference). MPAs have higher species counts in all years.

Figure 2. Line graph comparing species count across sites in each year in MPAs and non-MPAs (reference). MPAs have higher species counts in all years.

MPAs have a higher number of fish and invertebrate species than non-MPAs across all years.