{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction\n", "\n", "Active _cis_-regulatory sequences in the genome are characterized by accessible chromatin and specific histone modifications, which reflect the action of DNA-binding transcription factors (TFs) that recognize specific sequence motifs and recruit chromatin-modifying enzymes [@bib44]. These epigenetic hallmarks of active chromatin are routinely used to train machine learning models that predict _cis_-regulatory sequences, based on the assumption that such epigenetic marks are reliable predictors of genuine _cis_-regulatory sequences [@bib13; @bib19; @bib27; @bib41; @bib50; @bib77; @bib90]. However, results from functional assays show that many predicted _cis_-regulatory sequences exhibit little or no _cis_-regulatory activity. Typically, 50% or more of predicted _cis_-regulatory sequences fail to drive expression in massively parallel reporter assays (MPRAs) [@bib58; @bib48], indicating that an active chromatin state is not sufficient to reliably identify _cis_-regulatory sequences.\n", "\n", "Another challenge is that enhancers and silencers are difficult to distinguish by chromatin accessibility or epigenetic state [@bib11; @bib20; @bib62; @bib66; @bib76], and thus computational predictions of _cis_-regulatory sequences often do not differentiate between enhancers and silencers. Silencers are often enhancers in other cell types [@bib5; @bib11; @bib20; @bib30; @bib37; @bib61; @bib62], reside in open chromatin [@bib11; @bib29; @bib30; @bib62], sometimes bear epigenetic marks of active enhancers [@bib14; @bib30], and can be bound by TFs that also act on enhancers in the same cell type [@bib1; @bib21; @bib30; @bib35; @bib37; @bib52; @bib53; @bib65; @bib69; @bib70; @bib80; @bib85]. 
As a result, enhancers and silencers share similar sequence features, and understanding how they are distinguished in a particular cell type remains an important challenge [@bib76].\n", "\n", "The TF cone-rod homeobox (CRX) controls selective gene expression in a number of different photoreceptor and bipolar cell types in the retina [@bib6; @bib17; @bib18; @bib60]. These cell types derive from the same progenitor cell population [@bib45; @bib83], but they exhibit divergent, CRX-directed transcriptional programs [@bib9; @bib25; @bib31; @bib60]. CRX cooperates with cell type-specific co-factors to selectively activate and repress different genes in different cell types and is required for differentiation of rod and cone photoreceptors [@bib7; @bib23; @bib25; @bib28; @bib34; @bib43; @bib51; @bib55; @bib56; @bib60; @bib65; @bib75; @bib79]. However, the sequence features that define CRX-targeted enhancers vs. silencers in the retina are largely unknown.\n", "\n", "We previously found that a significant minority of CRX-bound sequences act as silencers in an MPRA conducted in live mouse retinas [@bib85], and that silencer activity requires CRX [@bib86]. Here, we extend our analysis by testing thousands of additional candidate _cis_-regulatory sequences. We show that while regions of accessible chromatin and CRX binding exhibit a range of _cis_-regulatory activity, enhancers and silencers contain more TF motifs than inactive sequences, and that enhancers are distinguished from silencers by a higher diversity of TF motifs. We capture the differences between these sequence classes with a new metric, motif information content (Boltzmann entropy), that considers only the number and diversity of TF motifs in a candidate _cis_-regulatory sequence. 
Our results suggest that CRX-targeted enhancers are defined by a flexible regulatory grammar and demonstrate how differences in motif information content encode functional differences between genomic loci with similar chromatin states." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Setup imports for analysis\n", "import os\n", "import sys\n", "import itertools\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import matplotlib.patches as mpatches\n", "from mpl_toolkits.axes_grid1 import make_axes_locatable\n", "from scipy import stats\n", "from sklearn.feature_selection import RFE, RFECV\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.model_selection import StratifiedKFold\n", "from pybedtools import BedTool\n", "from IPython.display import display\n", "import logomaker\n", "\n", "sys.path.insert(0, \"utils\")\n", "from utils import fasta_seq_parse_manip, gkmsvm, modeling, plot_utils, predicted_occupancy, quality_control, sequence_annotation_processing\n", "\n", "data_dir = os.path.join(\"Data\")\n", "figures_dir = os.path.join(\"Figures\")\n", "\n", "# Load in all sequences\n", "all_seqs = fasta_seq_parse_manip.read_fasta(os.path.join(data_dir, \"library1And2.fasta\"))\n", "# Drop scrambled sequences -- we don't need them for any analysis\n", "all_seqs = all_seqs[~(all_seqs.index.str.contains(\"scr\"))]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "plot_utils.set_manuscript_params()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Results\n", "\n", "We tested the activities of 4844 putative CRX-targeted _cis_-regulatory sequences (CRX-targeted sequences) by MPRA in live retinas. 
The MPRA libraries consist of 164 bp genomic sequences centered on the best match to the CRX position weight matrix (PWM) [@bib49] whenever a CRX motif is present, and matched sequences in which all CRX motifs were abolished by point mutation (Materials and methods). The MPRA libraries include 3299 CRX-bound sequences identified by ChIP-seq in the adult retina [@bib9] and 1545 sequences that do not have measurable CRX binding in the adult retina but reside in accessible chromatin in adult photoreceptors [@bib31] and have the H3K27ac enhancer mark in postnatal day 14 (P14) retina [@bib72] (‘ATAC-seq peaks’). We split the sequences across two plasmid libraries, each of which contained the same 150 scrambled sequences as internal controls ([Supplementary files 1 and 2](#supp1)). We cloned sequences upstream of the rod photoreceptor-specific _Rhodopsin_ (_Rho_) promoter and a _DsRed_ reporter gene, electroporated libraries into explanted mouse retinas at P0 in triplicate, harvested the retinas at P8, and then sequenced the RNA and input DNA plasmid pool. The data are highly reproducible across replicates (R^2^ > 0.96, [Figure 1—figure supplement 1](#fig1s1)). After activity scores were calculated and normalized to the basal _Rho_ promoter, the two libraries were well calibrated and were merged (two-sample Kolmogorov-Smirnov test p = 0.09, [Figure 1—figure supplement 2](#fig1s2), [Supplementary file 3](#supp3), and Materials and methods)." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Processing data for library1 with the Rho promoter...\n", "Reading in barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr16-87432635-87432799_CPPQ_scrambled</td>\n", " <td>3019</td>\n", " <td>148</td>\n", " <td>325</td>\n", " <td>97</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACCGC</th>\n", " <td>chr4-119112319-119112483_CPPE_WT</td>\n", " <td>4117</td>\n", " <td>24493</td>\n", " <td>25950</td>\n", " <td>23406</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGGG</th>\n", " <td>chr7-128854234-128854398_UPCE_WT</td>\n", " <td>86</td>\n", " <td>76</td>\n", " <td>39</td>\n", " <td>233</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr4-138107597-138107761_UPPE_WT</td>\n", " <td>827</td>\n", " <td>926</td>\n", " <td>857</td>\n", " <td>659</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-31298508-31298672_CPPE_WT</td>\n", " <td>7170</td>\n", " <td>492</td>\n", " <td>392</td>\n", " <td>149</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 RNA2 RNA3\n", "barcode \n", "AACAACAAG chr16-87432635-87432799_CPPQ_scrambled 3019 148 325 97\n", "AACAACCGC chr4-119112319-119112483_CPPE_WT 4117 
24493 25950 23406\n", "AACAACGGG chr7-128854234-128854398_UPCE_WT 86 76 39 233\n", "AACAACTAC chr4-138107597-138107761_UPPE_WT 827 926 857 659\n", "AACAACTGT chr5-31298508-31298672_CPPE_WT 7170 492 392 149" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Removing detection-limited barcodes and normalizing to counts per million.\n", "Barcodes missing in DNA:\n", "Sample DNA: 1090 barcodes\n", "1090 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 1744 barcodes\n", "Sample RNA2: 1913 barcodes\n", "Sample RNA3: 1491 barcodes\n", "2215 barcodes are off in more than 0 RNA samples.\n", "There are a total of 157.151 million barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr16-87432635-87432799_CPPQ_scrambled</td>\n", " <td>73.436588</td>\n", " <td>4.307406</td>\n", " <td>7.418047</td>\n", " <td>2.561422</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACCGC</th>\n", " <td>chr4-119112319-119112483_CPPE_WT</td>\n", " <td>100.145224</td>\n", " <td>712.846538</td>\n", " <td>592.302519</td>\n", " <td>618.068596</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGGG</th>\n", " <td>chr7-128854234-128854398_UPCE_WT</td>\n", " <td>2.091933</td>\n", " <td>2.211911</td>\n", " <td>0.890166</td>\n", " 
<td>6.152695</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr4-138107597-138107761_UPPE_WT</td>\n", " <td>20.116614</td>\n", " <td>26.950390</td>\n", " <td>19.560819</td>\n", " <td>17.401829</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-31298508-31298672_CPPE_WT</td>\n", " <td>174.408855</td>\n", " <td>14.319214</td>\n", " <td>8.947306</td>\n", " <td>3.934556</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 \\\n", "barcode \n", "AACAACAAG chr16-87432635-87432799_CPPQ_scrambled 73.436588 4.307406 \n", "AACAACCGC chr4-119112319-119112483_CPPE_WT 100.145224 712.846538 \n", "AACAACGGG chr7-128854234-128854398_UPCE_WT 2.091933 2.211911 \n", "AACAACTAC chr4-138107597-138107761_UPPE_WT 20.116614 26.950390 \n", "AACAACTGT chr5-31298508-31298672_CPPE_WT 174.408855 14.319214 \n", "\n", " RNA2 RNA3 \n", "barcode \n", "AACAACAAG 7.418047 2.561422 \n", "AACAACCGC 592.302519 618.068596 \n", "AACAACGGG 0.890166 6.152695 \n", "AACAACTAC 19.560819 17.401829 \n", "AACAACTGT 8.947306 3.934556 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing RNA to DNA.\n", "Averaging across barcodes within a replicate.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>BASAL</th>\n", " <td>0.331679</td>\n", " <td>0.306512</td>\n", " <td>0.277308</td>\n", " </tr>\n", " <tr>\n", 
" <th>chr1-104768570-104768734_UPCQ_MUT-allCrxSites</th>\n", " <td>1.005172</td>\n", " <td>0.826315</td>\n", " <td>0.930872</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_WT</th>\n", " <td>1.114088</td>\n", " <td>1.080287</td>\n", " <td>1.091619</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_MUT-allCrxSites</th>\n", " <td>1.180305</td>\n", " <td>1.094909</td>\n", " <td>0.798394</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_WT</th>\n", " <td>0.441799</td>\n", " <td>0.533383</td>\n", " <td>0.868990</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " RNA1 RNA2 RNA3\n", "label \n", "BASAL 0.331679 0.306512 0.277308\n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 1.005172 0.826315 0.930872\n", "chr1-104768570-104768734_UPCQ_WT 1.114088 1.080287 1.091619\n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 1.180305 1.094909 0.798394\n", "chr1-106008207-106008371_CPPE_WT 0.441799 0.533383 0.868990" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing to the basal Rho promoter.\n", "Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/ryan/Documents/DBBS/CohenLab/Manuscripts/CRX-Information-Content/utils/quality_control.py:408: RuntimeWarning: invalid value encountered in double_scalars\n", " cov = std / mean\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Done processing data!\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " 
<th></th>\n", " <th>expression</th>\n", " <th>expression_std</th>\n", " <th>expression_reps</th>\n", " <th>expression_pvalue</th>\n", " <th>expression_qvalue</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_MUT-allCrxSites</th>\n", " <td>3.027744</td>\n", " <td>0.330482</td>\n", " <td>3.0</td>\n", " <td>0.000139</td>\n", " <td>0.000749</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_WT</th>\n", " <td>3.606621</td>\n", " <td>0.297412</td>\n", " <td>3.0</td>\n", " <td>0.001206</td>\n", " <td>0.003548</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_MUT-allCrxSites</th>\n", " <td>3.336604</td>\n", " <td>0.396284</td>\n", " <td>3.0</td>\n", " <td>0.003039</td>\n", " <td>0.007388</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_WT</th>\n", " <td>2.068611</td>\n", " <td>0.944664</td>\n", " <td>3.0</td>\n", " <td>0.080583</td>\n", " <td>0.103242</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_scrambled</th>\n", " <td>1.439587</td>\n", " <td>0.579277</td>\n", " <td>3.0</td>\n", " <td>0.279730</td>\n", " <td>0.312931</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " expression expression_std \\\n", "label \n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 3.027744 0.330482 \n", "chr1-104768570-104768734_UPCQ_WT 3.606621 0.297412 \n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 3.336604 0.396284 \n", "chr1-106008207-106008371_CPPE_WT 2.068611 0.944664 \n", "chr1-106171416-106171580_CSPE_scrambled 1.439587 0.579277 \n", "\n", " expression_reps \\\n", "label \n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 3.0 \n", "chr1-104768570-104768734_UPCQ_WT 3.0 \n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 3.0 \n", "chr1-106008207-106008371_CPPE_WT 3.0 \n", "chr1-106171416-106171580_CSPE_scrambled 
3.0 \n", "\n", " expression_pvalue \\\n", "label \n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 0.000139 \n", "chr1-104768570-104768734_UPCQ_WT 0.001206 \n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 0.003039 \n", "chr1-106008207-106008371_CPPE_WT 0.080583 \n", "chr1-106171416-106171580_CSPE_scrambled 0.279730 \n", "\n", " expression_qvalue \n", "label \n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 0.000749 \n", "chr1-104768570-104768734_UPCQ_WT 0.003548 \n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 0.007388 \n", "chr1-106008207-106008371_CPPE_WT 0.103242 \n", "chr1-106171416-106171580_CSPE_scrambled 0.312931 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Processing data for library2 with the Rho promoter...\n", "Reading in barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>\n", " <td>132</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGTT</th>\n", " <td>chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>\n", " <td>1779</td>\n", " <td>36</td>\n", " <td>17</td>\n", " <td>46</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>\n", " <td>2928</td>\n", " 
<td>433</td>\n", " <td>802</td>\n", " <td>510</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTCG</th>\n", " <td>chr12-116230818-116230982_CPPE_WT</td>\n", " <td>2822</td>\n", " <td>3043</td>\n", " <td>2967</td>\n", " <td>3013</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>\n", " <td>1810</td>\n", " <td>1572</td>\n", " <td>2281</td>\n", " <td>1559</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 RNA2 \\\n", "barcode \n", "AACAACAAG chr7-141291911-141292075_UPPP_MUT-allCrxSites 132 0 1 \n", "AACAACGTT chr19-16380352-16380516_CPPN_MUT-allCrxSites 1779 36 17 \n", "AACAACTAC chr1-44147572-44147736_UPPP_MUT-allCrxSites 2928 433 802 \n", "AACAACTCG chr12-116230818-116230982_CPPE_WT 2822 3043 2967 \n", "AACAACTGT chr5-65391346-65391510_CPPP_MUT-allCrxSites 1810 1572 2281 \n", "\n", " RNA3 \n", "barcode \n", "AACAACAAG 1 \n", "AACAACGTT 46 \n", "AACAACTAC 510 \n", "AACAACTCG 3013 \n", "AACAACTGT 1559 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Removing detection-limited barcodes and normalizing to counts per million.\n", "Barcodes missing in DNA:\n", "Sample DNA: 277 barcodes\n", "277 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 875 barcodes\n", "Sample RNA2: 678 barcodes\n", "Sample RNA3: 774 barcodes\n", "1180 barcodes are off in more than 0 RNA samples.\n", "There are a total of 157.724 million barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " 
<th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>\n", " <td>3.144868</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGTT</th>\n", " <td>chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>\n", " <td>42.384243</td>\n", " <td>0.933407</td>\n", " <td>0.406204</td>\n", " <td>1.301935</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>\n", " <td>69.758888</td>\n", " <td>11.226812</td>\n", " <td>19.163280</td>\n", " <td>14.434499</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTCG</th>\n", " <td>chr12-116230818-116230982_CPPE_WT</td>\n", " <td>67.233464</td>\n", " <td>78.898818</td>\n", " <td>70.894577</td>\n", " <td>85.276757</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>\n", " <td>43.122810</td>\n", " <td>40.758772</td>\n", " <td>54.503043</td>\n", " <td>44.124283</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA \\\n", "barcode \n", "AACAACAAG chr7-141291911-141292075_UPPP_MUT-allCrxSites 3.144868 \n", "AACAACGTT chr19-16380352-16380516_CPPN_MUT-allCrxSites 42.384243 \n", "AACAACTAC chr1-44147572-44147736_UPPP_MUT-allCrxSites 69.758888 \n", "AACAACTCG chr12-116230818-116230982_CPPE_WT 67.233464 \n", "AACAACTGT chr5-65391346-65391510_CPPP_MUT-allCrxSites 43.122810 \n", "\n", " RNA1 RNA2 RNA3 \n", "barcode \n", "AACAACAAG 0.000000 0.000000 0.000000 \n", "AACAACGTT 0.933407 0.406204 1.301935 \n", "AACAACTAC 11.226812 19.163280 14.434499 \n", "AACAACTCG 78.898818 70.894577 85.276757 \n", "AACAACTGT 40.758772 54.503043 44.124283 " ] }, "metadata": {}, "output_type": 
"display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing RNA to DNA.\n", "Averaging across barcodes within a replicate.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>BASAL</th>\n", " <td>0.196778</td>\n", " <td>0.218638</td>\n", " <td>0.236666</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_MUT-allCrxSites</th>\n", " <td>7.325586</td>\n", " <td>5.922791</td>\n", " <td>6.286389</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_WT</th>\n", " <td>6.418129</td>\n", " <td>5.188716</td>\n", " <td>4.976230</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_MUT-shape</th>\n", " <td>0.282047</td>\n", " <td>0.264416</td>\n", " <td>0.290612</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_WT</th>\n", " <td>0.260469</td>\n", " <td>0.276250</td>\n", " <td>0.212923</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " RNA1 RNA2 RNA3\n", "label \n", "BASAL 0.196778 0.218638 0.236666\n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 7.325586 5.922791 6.286389\n", "chr1-10229074-10229238_CPPE_WT 6.418129 5.188716 4.976230\n", "chr1-106171416-106171580_CSPE_MUT-shape 0.282047 0.264416 0.290612\n", "chr1-106171416-106171580_CSPE_WT 0.260469 0.276250 0.212923" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing to the basal 
Rho promoter.\n", "Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.\n", "Done processing data!\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>expression</th>\n", " <th>expression_std</th>\n", " <th>expression_reps</th>\n", " <th>expression_pvalue</th>\n", " <th>expression_qvalue</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_MUT-allCrxSites</th>\n", " <td>30.293101</td>\n", " <td>6.011230</td>\n", " <td>3.0</td>\n", " <td>0.000003</td>\n", " <td>0.000128</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_WT</th>\n", " <td>25.791454</td>\n", " <td>6.063103</td>\n", " <td>3.0</td>\n", " <td>0.000019</td>\n", " <td>0.000167</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_MUT-shape</th>\n", " <td>1.290214</td>\n", " <td>0.124284</td>\n", " <td>3.0</td>\n", " <td>0.023905</td>\n", " <td>0.031469</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_WT</th>\n", " <td>1.162281</td>\n", " <td>0.229405</td>\n", " <td>3.0</td>\n", " <td>0.226254</td>\n", " <td>0.246199</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_scrambled</th>\n", " <td>1.995027</td>\n", " <td>0.380942</td>\n", " <td>3.0</td>\n", " <td>0.012703</td>\n", " <td>0.018175</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " expression expression_std \\\n", "label \n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 
30.293101 6.011230 \n", "chr1-10229074-10229238_CPPE_WT 25.791454 6.063103 \n", "chr1-106171416-106171580_CSPE_MUT-shape 1.290214 0.124284 \n", "chr1-106171416-106171580_CSPE_WT 1.162281 0.229405 \n", "chr1-106171416-106171580_CSPE_scrambled 1.995027 0.380942 \n", "\n", " expression_reps \\\n", "label \n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 3.0 \n", "chr1-10229074-10229238_CPPE_WT 3.0 \n", "chr1-106171416-106171580_CSPE_MUT-shape 3.0 \n", "chr1-106171416-106171580_CSPE_WT 3.0 \n", "chr1-106171416-106171580_CSPE_scrambled 3.0 \n", "\n", " expression_pvalue \\\n", "label \n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 0.000003 \n", "chr1-10229074-10229238_CPPE_WT 0.000019 \n", "chr1-106171416-106171580_CSPE_MUT-shape 0.023905 \n", "chr1-106171416-106171580_CSPE_WT 0.226254 \n", "chr1-106171416-106171580_CSPE_scrambled 0.012703 \n", "\n", " expression_qvalue \n", "label \n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 0.000128 \n", "chr1-10229074-10229238_CPPE_WT 0.000167 \n", "chr1-106171416-106171580_CSPE_MUT-shape 0.031469 \n", "chr1-106171416-106171580_CSPE_WT 0.246199 \n", "chr1-106171416-106171580_CSPE_scrambled 0.018175 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAARgAAAD/CAYAAAAquMkCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEflJREFUeJzt3X+wHWV9x/H3hwSNkgQMSbECSQoqsVKDcvFn+aVWBaVa6VgVgalOg7Z0ahWrtYAZBXH8gdTRRqOppJJSBbEiKk5VVPDnXDqgpAYsJSGIgUQkJiEEgp/+sRs9Odzcuzc5z/nF5zVzJvc8e/bs90nufLL77O6zsk1ERAl79bqAiBheCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIopJwEREMVN7XcDumj17tufPn9/rMiIeka6//voNtudM9LmBDZj58+czOjra6zIiHpEkrWnyuRwiRUQxCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIopJwEREMQN7JW8Mn/nv+HKvS2hs9fte2usSBkL2YCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGKG/l6k3N8S0TvZg4mIYhIwEVFMAiYiiknAREQxXQkYSY+WtEzSGkmbJN0g6YR62XxJlrS55XVON+qKiLK6dRZpKrAWOBa4HTgR+JykP2r5zH62t3epnojogq7swdjeYnux7dW2f2P7KuA24MhubD8ieqMnYzCSDgCeDKxsaV4j6Q5Jn5Y0uxd1RURndT1gJO0NrACW214FbACOAuZR7dHMqJePte4iSaOSRtevX9+tkiNiN3U1YCTtBXwGeAA4E8D2Ztujtrfbvqtuf5GkGe3r215qe8T2yJw5c7pZekTshq7dKiBJwDLgAOBE2w/u4qOu/8wp9IgB1817kZYATwFeaHvrjkZJzwLuBX4GPA74CPAt2xu7WFtEFNCt62DmAWcARwDrWq53OQU4BLga2ATcBGwDXtONuiKirK7swdheA2icj1zajToiorsyzhERxQz9fDARvTYocxKVmI8oezARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk/lgBtSgzDECZeYZicGQPZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTKOAkfQWSUfUPz9b0u2SbpP0nLLlRcQga7oH8/fAbfXPFwAXAucBF5UoKiKGQ9ML7fa1vVHSDGAh1QPsH5L0oYK1RcSAaxowayU9F3gq8J06XGYCD5UrLSIGXdNDpLcBlwP/BLynbnsZ8KMmK0t6tKRlktZI2iTpBkkntCx/gaRVku6TdI2keZPpRET0p0YBY/srtp9ge77t6+vmy4CTGm5nKrAWOBbYFzgb+Jyk+ZJmA1cA5wCzgFHgs5PoQ0T0qUaHSJLusT2rtc32g5LuBn5vovVtbwEWtzRdJek24Ehgf2Cl7cvqbS0GNkhaYHtVo15ERF9qeoi0d3uDpL2BKbuzUUkHAE8GVlKN69y4Y1kdRrfW7e3rLZI0Kml0/fr1u7PpiOiicfdgJF0LGJgm6Tttiw8CvjfZDdbBtAJYbnuVpOlAe1psBGa0r2t7KbAUYGRkxJPddkR010SHSJ8CBBwFLGtpN3AX8M3JbEzSXsBngAeAM+vmzcDMto/OBDZN5rsjov+MGzC2lwNI+sGejodIElVIHQCcaPvBetFK4PSWz+0DHFq3R8QAazTIWx/KvAg4ApjetuzchttaAjyF6iK9rS3tXwA+IOlk4MvAucCPM8AbMfiankX6KPAq4BrgvpZFjcZB6utazgC2AeuqnRkAzrC9og6XjwKXAD8EXt2o+ojoa02v5H0tsND22t3ZiO01VGM5u1r+dWDB7nx3RPSvpqepNwD3liwkIoZP0z2
YDwErJF1Adfbot2z/X8erioih0DRgltR/vqyt3ezmxXYRMfyankXKzHcRMWkJjogopulp6h23DDyM7WM6WlFEDI2mYzCfanv/eOANVNetRESMqekYzPL2NkmfBz4NvLvTRUXEcNiTMZifA0/rVCERMXyajsG8vq3pscArgR90vKKIGBpNx2BObXu/hWoumA93tpyIGCZNx2COL11IRAyfpnswSHoS8BrgQKrxl0tt/6xUYREx+Jo+OvYk4HqqO57vAQ4DRiX9acHaImLANd2DeS/wctvX7GiQdBzVHC5XFqgrIoZA09PUBwHXtrVdV7dHRIypacDcALy1re0tdXtExJiaHiK9CfiSpL+jekLjwVRTZzZ9smNEPAJNZtLvpwDPBp4A3An8sOXJABERD9P0St4jgF/avq6l7WBJs2zfOM6qEfEI1nQM5hIe/vjYR1E9RC0iYkxNA2Zu+9y7tm8F5ne8oogYGk0D5g5Jz2htqN/f2fmSImJYND2L9GHgi5LeD9xK9WjXs4DzSxUWEYOv6VmkT0q6l2oWu4OpTlW/1fblJYuLiMHW+GZH25cBlxWsJSKGTJ4qEBHFdC1gJJ0paVTSNkkXt7TPl2RJm1te53Srrogop/EhUgfcCZwHvBh4zBjL97O9vYv1RERhXQsY21cASBohd2FHPCLsMmAkNXocie1zO1TLGkkG/gt4m+0NY9S0CFgEMHfu3A5tNiJKGW8P5uAu1bABOIpq6of9gY8BK6gOpXZieymwFGBkZGTMJ01GRP/YZcDY/stuFGB7MzBav71L0pnALyTNsL2pGzVERBmTGoORNAOYDWhHW/s9Sh2wY88kp9AjBlzT6Rr+kOqwZSFVAIjfBcGUht8xtd7eFGCKpGnAduBI4F7gZ8DjgI8A37K9sXk3IqIfNd1L+BfgGmAW8GuqIPgEcPoktnU2sBV4B/C6+uezgUOAq4FNwE3ANqrHo0TEgGt6iLQQ+BPbD0qS7Y2S3kYVCJc0+QLbi4HFu1h8acM6ImKANN2DuZ/fTTi1QdLcet39i1QVEUOhacBcC7yq/vly4KvAt4FvligqIoZD0+kaXtXy9p1Uh0YzgOUlioqI4dD00bFn7fjZ9m9sX2J7CfDGYpVFxMBreoi0q9sBzu5UIRExfMY9RJL0/PrHKZKOp+UCO6rTy7nSNiJ2aaIxmGX1n9OAf21pN7AO+NsSRUXEcBg3YGz/AYCkf7N9WndKiohh0fQs0mn1pf7PBQ4E7gC+nwmiImI8Te9FOgy4imomurVUUzncL+kk2z8tWF9EDLCmZ5GWUM3DcrDt59g+CPg41T1KERFjahowRwAX2m6d5Omiuj0iYkxNA+ZO4Ni2tqPJo2MjYhxN76Z+J3ClpKuANcA84KVU0y5ERIyp0R6M7SuBZ/C7e5BuAo60/cWCtUXEgGt6Fuks2x+keq5Ra/tbbF9YpLKIGHi5Fykiism9SBFRTO5Fiohici9SRBTT9CxSwiUiJi0PN4uIYhIwEVFMAiYiiknAREQxexQwkn7SqUIiYvjs6R7MBR2pIiKGUtPnIj1+F4saP9lR0pmSRiVtk3Rx27IXSFol6T5J10ia1/R7I6J/Nd2DuWUX7f8ziW3dSXWzZOsVwUiaDVwBnAPMAkaBz07ieyOiTzWdD0YPa5BmAr9puiHbV9TrjQAHtSx6JbDS9mX18sXABkkLbK9q+v0R0X8mutlxLdV9R4+RdHvb4v2BSztQw1OBG3e8sb1F0q11ewImYoBNtAfzOqq9l68Ap7a0G7jL9s0dqGE6sL6tbSPVxFY7kbQIWAQwd+7cDmw6Ikqa6GbHb0M1TmL7vkI1bAZmtrXNZIypIGwvpXq6ASMjI25fHhH9pekg7yWSjm5tkHS0pMs7UMNKYGHL9+4DHFq3R8QAaxowxwLfa2v7PnB80w1JmippGjCFagKrafXTIr8AHC7p5Hr5ucCPM8AbMfiaBsz9wD5tbdOBByexrbOBrcA7qMZ2tgJn214
PnAycD/wKeBbw6kl8b0T0qaanqb8GfELSGbZ/XZ+i/ihwddMN2V4MLN7Fsq8DC5p+V0QMhqZ7MG+lGni9R9LdwD3AvsCbSxUWEYOv0R6M7V8BL61vGTgYWGt7XdHKImLgTXSh3WOpxk4OB/4buCDBEhFNTXSI9DHgJKorav8c+GDxiiJiaEwUMC8BXmT7H4ATgJeVLykihsVEAbOP7V8A2F5LNbAbEdHIRIO8U9ue6Nj+HtuN54SJiEeWiQLmbnaev+WXPPwJj4d0uqiIGA4T3ew4v0t1RMQQylMFIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQU0zcBI+lbku6XtLl+3dzrmiJiz/RNwNTOtD29fh3W62IiYs/0W8BExBDpt4C5QNIGSd+VdFyvi4mIPdNPAfN2qmcsHQgsBb4k6dDWD0haJGlU0uj69et7UWNETELfBIztH9reZHub7eXAd4ET2z6z1PaI7ZE5c+b0ptCIaKxvAmYMpuURtRExePoiYCTtJ+nFkqZJmirpFOAY4Ope1xYRu2+iZ1N3y97AecAC4CFgFfAK27f0tKqI2CN9ETC21wNH9bqOiOisvjhEiojhlICJiGISMBFRTAImIopJwEREMQmYiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTN8EjKRZkr4gaYukNZJe2+uaImLPTO11AS0+BjwAHAAcAXxZ0o22V/a2rIjYXX2xByNpH+Bk4Bzbm21fB1wJnNrbyiJiT/RFwABPBrbbvqWl7UbgqT2qJyI6QLZ7XQOSjgYus/34lra/Ak6xfVxL2yJgUf32MODmbtbZZjawoYfbL2EY+wTD2a9e92me7TkTfahfxmA2AzPb2mYCm1obbC8FlnarqPFIGrU90us6OmkY+wTD2a9B6VO/HCLdAkyV9KSWtoVABngjBlhfBIztLcAVwLsl7SPpecDLgc/0trKI2BN9ETC1vwYeA9wNXAq8qc9PUffFoVqHDWOfYDj7NRB96otB3ogYTv20BxMRQyYBExHFJGBqklZL2ippk6R7JX1P0hsl7VUvv1iSJT2zZZ0nSnrYMWb92e2Sfr+bfai3vaMfmyWtq2uZ3lLXHvVB0uGSviZpw1jrldCFPp0u6XpJv5Z0h6T3Syp+CUcX+vVqSTdL2ijpbknLJbVfDlJUAmZnJ9meAcwD3ge8HVjWsvwe4LzxvqDltoeNwOsK1TmRk2xPp7qn6+nAP7Ys29M+PAh8DnhDx6ptpmSfHgu8meritWcBLwDO6kzZEyrZr+8Cz7O9L3AI1XVv435fpyVgxmB7o+0rgb8ATpd0eL1oOfA0SceOs/rJwL3Au4HTy1Y6PtvrgK9R/fLusEd9sH2z7WX06BqlQn1aYvta2w/Y/jmwAnheZysfX6F+rbXderXvQ8ATO1NxMwmYcdj+EXAHcHTddB/wXuD8cVY7neo0+38ACyQdWbTIcUg6CDgB+N+W5oHqQ7su9ekYuhygpfol6Y8lbaS6Kv5k4KJO1j2RBMzE7gRmtbz/BDBX0gntH5Q0Fzge+HfbdwHfAE7rSpU7+09Jm4C1VNcVvatt+SD0oV1X+iTp9cAI8MEO1j6eov2yfV19iHQQ8AFgdcd7MI4EzMQOpDoWBsD2NuA99avdqcBPbd9Qv18BvFbS3sWr3Nkr6rGk44AFVGMLvzUgfWhXvE+SXgFcAJzQdmhRUlf+repDv6up9nS6JgEzDklHUQXMdW2LPg3sB7yyrf004JD6jMA64EKqX5gTS9c6FtvfBi5m7P+NB6IP7Ur1SdJLgE9SDbr+pEDp4+rSv9VU4NCOFNxQv9xN3VfqU3n
HAP8MXGL7J5J+u9z2dknvAj7Sss5zqP7xng6sb/m6D1H9InyxC6WP5SJgtaSFrY272wdVfxGPBh5VrzOt+jpvK9qLnXW6T8+n+t//z+pxt17pdL9OAa61fbukeVRjOd8o3Ied2c6rul1iNbCVajBsI/B94G+AKfXyi4HzWj6/F3BT9VdogI8Dnx/je58JbANmdbEfL2xrWwJ8vhN9AOYDbnutHvA+XQNsp5o2ZMfrq0Pwb3U+1UmKLfWfS4H9u/F7uOOVe5EiopiMwUREMQmYiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDH/Dyuvv2O9dSNEAAAAAElFTkSuQmCC\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAARgAAAD/CAYAAAAquMkCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEfZJREFUeJzt3XuQXnV9x/H3B0JFIQEhK7ZAkoIKVmtQgtdyU6uCUq10rDdgqtOobTq1itU6oKlFcbwgdbDRaCopSamAqAiK4wUVvOAsDqipAUshBBHYSIlJgEDg0z/OWX14stk9m31+z43Pa+aZ3f2d8zzn+8vufHLO73cusk1ERAm79LqAiBheCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIopJwEREMbN6XcDOmjt3rhcsWNDrMiIeka655poNtkemWm9gA2bBggWMjo72uoyIRyRJ65qsl0OkiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoZmDP5G1qwbsu63UJjd38wZf2uoSeyu9q+GQPJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMVwJG0qMkrZC0TtImSddKOq5etkCSJW1ueZ3ejboioqxunck7C1gPHA3cAhwPXCDpj1vW2dv2ti7VExFd0JU9GNtbbC+1fbPth2xfCtwEHN6N7UdEb/RkDEbSfsCTgDUtzesk3Srps5Lm7uB9iyWNShodGxvrSq0RsfO6HjCSdgNWAyttrwU2AEcA86n2aGbXy7dje7ntRbYXjYxM+UiWiOixrl5NLWkX4DzgfmAJgO3NwPgDju6QtAT4laTZtjd1s76I6KyuBYwkASuA/YDjbT+wg1Vdf80UesSA6+YezDLgycALbd873ijpWcDdwC+AxwIfB75te2MXa4uIAroSMJLmA28CtgK3VzszULc9BHwAeBzwG+DrwGu6Udcgy82ZYhB0JWBsrwM0ySrnd6OOiOiujHNERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQU09U72kU8Eg3KrTVK3FYjezARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQU05WAkfQoSSskrZO0SdK1ko5rWf4CSWsl3SPpCknzu1FXRJTVrT2YWcB64GhgL+A04AJJCyTNBS4GTgf2AUaBz3Wprog
oqCt3tLO9BVja0nSppJuAw4F9gTW2LwSQtBTYIOlQ22u7UV9ElNFoD0bS2yQdVn//bEm3SLpJ0nN2ZqOS9gOeBKwBngJcN76sDqMb6/b29y2WNCppdGxsbGc2HRFd1PQQ6R+Am+rvzwTOAs4Azp7uBiXtBqwGVtZ7KHsCG9tW2wjMbn+v7eW2F9leNDIyMt1NR0SXNT1E2sv2RkmzgYXAC20/KOmj09mYpF2A84D7gSV182ZgTtuqc4BN0/nsiOg/TQNmvaTnUh22fLcOlznAg003JEnACmA/4HjbD9SL1gCntKy3B3Bw3R4RA6xpwLwDuIhqz+PEuu1lwI+msa1lwJOp9n7ubWn/AvBhSScClwHvAX6SAd6IwdcoYGx/BfiDtuYLgQuavL8+r+VNwFbg9mpnBoA32V5dh8s5wCrgauDVTT43Ivpbo4CRdJftfVrbbD8g6U7gcVO93/Y6QJMs/wZwaJNaImJwNJ1F2q29oZ4N2rWz5UTEMJl0D0bSlYCB3SV9t23xAcD3SxUWEYNvqkOkz1Ad2hxBNQM0zsAdwLcK1RURQ2DSgLG9EkDSDzOrExHT1XQWaa2kFwGHUZ1527rsPSUKi4jB13QW6RzgVcAVwD0ti1yiqIgYDk1PtHstsND2+pLFRMRwaTpNvQG4u2QhETF8mu7BfBRYLelMqtmj37L9vx2vKiKGQtOAWVZ/fVlbu8nJdhGxA01nkXJz8IiYtgRHRBTTdJp6/JKB7dg+qqMVRcTQaDoG85m2nx8PvJHq9goRERNqOgazsr1N0ueBzwLv63RRETEcZjIG80vgaZ0qJCKGT9MxmDe0NT0GeCXww45XFBFDo+kYzEltP2+huhfMxzpbTkQMk6ZjMMeWLiQihk/jR8dKeiLwGmB/qvGX823/olRhETH4mj469gTgGqobc98FHAKMSvqzgrVFxIBrugfzAeDltq8Yb5B0DNWjRi4pUFdEDIGm09QHAFe2tV1Vt0dETKhpwFwLvL2t7W11e0TEhJoeIr0F+LKkvwfWAwdS3TrzhFKFRcTgm85Nv58MPJvqEbK3AVe3PMA+ImI7Tc/kPQz4te2rWtoOlLSP7euKVRcRA63pGMwqtn987O8B53W2nIgYJk0DZl77vXdt3wgsaLohSUskjUraKunclvYFkixpc8vr9KafGxH9q+kg762SnmH7x+MNkp5BNRbT1G3AGcCLgUdPsHxv29um8XkR0eeaBszHgC9J+hBwI3AwcCrw/qYbsn0xgKRF5PyZiEeEprNIn5Z0N9Vd7A6kmqp+u+2LOljLOkkGvg68w/aGDn52RPRA44sdbV8IXFighg3AEVQn7e0LfAJYTXUo9TCSFgOLAebNm1eglIjopJ4/VcD2ZtujtrfZvgNYArxI0uwJ1l1ue5HtRSMjI90vNiKmpecBM4Hxpxf0Y20RMQ2ND5FmStKsenu7ArtK2h3YBhxO9dzrXwCPBT4OfNv2xm7VFhFldHMv4TTgXuBdwOvr708DDgIuBzYBPwO2Ut3YKiIG3A73YCQ1ehyJ7fc0XG8psHQHi89v8hkRMVgmO0Q6sGtVRMRQ2mHA2P6rbhYSEcNnWoO89dTxXEDjbe3XKEVEjGt6u4Y/ojr5bSHVNLL43XTyrmVKi4hB13QW6d+AK4B9gN9QTSd/CjilUF0RMQSaHiItBP7U9gOSZHujpHdQTSuvKldeRAyypnsw9/G7G05tkDSvfu++RaqKiKHQNGCuBF5Vf38R8FXgO8C3ShQVEcOh6e0aXtXy47upDo1mAytLFBURw6Hpo2NPHf/e9kO2V9leBry5WGURMfCaHiLt6HKA0zpVSEQMn0kPkSQ9v/52V0nH0nKCHdVFiptKFRYRg2+qMZgV9dfdgX9vaTdwO/B3JYqKiOEwacDY/kMASf9h++TulBQRw6LpLNLJ9Q2jngvsD9wK/CCPGYmIyTS9FukQ4FKq5xmtp7qVw32STrD984L1RcQAazqLtAxYDhxo+zm2DwA
+SXWNUkTEhJoGzGHAWbbd0nZ23R4RMaGmAXMbcHRb25FM79GxEfEI0/Rq6ncDl0i6FFgHzAdeSnXz7oiICTXag7F9CfAMfncN0s+Aw21/qWBtETHgms4inWr7I8AZbe1vs31WkcoiYuDlWqSIKCbXIkVEMbkWKSKKybVIEVFM01mkhEtETFvTQd6IiGlLwEREMV0LGElLJI1K2irp3LZlL5C0VtI9kq6QNL9bdUVEOd3cg7mN6kS91tkoJM0FLgZOp3py5CjwuS7WFRGFzChgJP206bq2L7b9ReDXbYteCayxfaHt+4ClwEJJh86ktojovZnuwZzZgRqeAlw3/oPtLcCNdXtEDLCmz0V6/A4WdeLJjnsCG9vaNlJdVNlex+J6HGd0bGysA5uOiJKa7sHcsIP2/+5ADZuBOW1tc5jgMgTby20vsr1oZGSkA5uOiJKaBoy2a5DmAA91oIY1wMKWz90DOLhuj4gBNtXFjuuprjt6tKRb2hbvC5zfdEP1UwlmAbtSXTy5O7AN+ALwYUknApdRXbn9E9trG/ciIvrSVBc7vp5q7+UrwEkt7QbusH39NLZ1GvDets/+Z9tL63A5B1gFXA28ehqfGxF9aqqLHb8D1bkqtu+ZyYZsL6Wagp5o2TeATEtHDJmmYzCrJB3Z2iDpSEkXFagpIoZE04A5Gvh+W9sPgGM7W05EDJOmAXMfsEdb257AA50tJyKGSdOA+RrwqXpqenyK+hzg8lKFRcTgaxowb6c6+e0uSXcCdwF7AW8tVVhEDL5Gjy2x/X/AS+tLBg4E1tu+vWhlETHwpjrR7jFU5688FfgxcGaCJSKamuoQ6RPACcBa4C+AjxSvKCKGxlQB8xLgRbb/ETgOeFn5kiJiWEwVMHvY/hWA7fVUA7sREY1MNcg7q+2Jju0/Y7sT94SJiCE0VcDcycPvoftrtn/C40GdLioihsNUFzsu6FIdETGE8lykiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxfRMwkr4t6T5Jm+vX9b2uKSJmpm8CprbE9p7165BeFxMRM9NvARMRQ6TfAuZMSRskfU/SMb0uJiJmpp8C5p1UTyjYH1gOfFnSwa0rSFosaVTS6NjYWC9qjIhp6JuAsX217U22t9peCXwPOL5tneW2F9leNDIy0ptCI6KxvgmYCZiWB7xFxODpi4CRtLekF0vaXdIsSa8DjgIu73VtEbHzpnqyY7fsBpwBHAo8CKwFXmH7hp5WFREz0hcBY3sMOKLXdUREZ/XFIVJEDKcETEQUk4CJiGISMBFRTAImIopJwEREMQmYiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIorpm4CRtI+kL0jaImmdpNf2uqaImJlZvS6gxSeA+4H9gMOAyyRdZ3tNb8uKiJ3VF3swkvYATgROt73Z9lXAJcBJva0sImaiLwIGeBKwzfYNLW3XAU/pUT0R0QGy3esakHQkcKHtx7e0/TXwOtvHtLQtBhbXPx4CXN/NOtvMBTb0cPslDGOfYDj71es+zbc9MtVK/TIGsxmY09Y2B9jU2mB7ObC8W0VNRtKo7UW9rqOThrFPMJz9GpQ+9csh0g3ALElPbGlbCGSAN2KA9UXA2N4CXAy8T9Iekp4HvBw4r7eVRcRM9EXA1P4GeDRwJ3A+8JY+n6Lui0O1DhvGPsFw9msg+tQXg7wRMZz6aQ8mIoZMAiYiiknA1CTdLOleSZsk3S3p+5LeLGmXevm5kizpmS3veYKk7Y4x63W3Sfr9bvah3vZ4PzZLur2uZc+WumbUB0lPlfQ1SRsmel8JXejTKZKukfQbSbdK+pCk4qdwdKF
fr5Z0vaSNku6UtFJS++kgRSVgHu4E27OB+cAHgXcCK1qW3wWcMdkHtFz2sBF4faE6p3KC7T2prul6OvBPLctm2ocHgAuAN3as2mZK9ukxwFupTl57FvAC4NTOlD2lkv36HvA823sBB1Gd9zbp53VaAmYCtjfavgT4S+AUSU+tF60Enibp6EnefiJwN/A+4JSylU7O9u3A16j+eMfNqA+2r7e9gh6do1SoT8tsX2n7ftu/BFYDz+ts5ZMr1K/1tlvP9n0QeEJnKm4mATMJ2z8CbgWOrJvuAT4AvH+St51CNc3+X8Chkg4vWuQkJB0AHAf8T0vzQPWhXZf6dBRdDtBS/ZL0J5I2Up0VfyJwdifrnkoCZmq3Afu0/PwpYJ6k49pXlDQPOBb4T9t3AN8ETu5KlQ/3RUmbgPVU5xW9t235IPShXVf6JOkNwCLgIx2sfTJF+2X7qvoQ6QDgw8DNHe/BJBIwU9uf6lgYANtbgX+pX+1OAn5u+9r659XAayXtVrzKh3tFPZZ0DHAo1djCbw1IH9oV75OkVwBnAse1HVqU1JXfVX3odznVnk7XJGAmIekIqoC5qm3RZ4G9gVe2tZ8MHFTPCNwOnEX1B3N86VonYvs7wLlM/L/xQPShXak+SXoJ8GmqQdefFih9Ul36Xc0CDu5IwQ31y9XUfaWeyjsK+Fdgle2fSvrtctvbJL0X+HjLe55D9ct7OjDW8nEfpfpD+FIXSp/I2cDNkha2Nu5sH1T9QzwK+L36PbtXH+etRXvxcJ3u0/Op/vf/83rcrVc63a/XAVfavkXSfKqxnG8W7sPD2c6rulziZuBeqsGwjcAPgL8Fdq2Xnwuc0bL+LsDPqn9CA3wS+PwEn/tMYCuwTxf78cK2tmXA5zvRB2AB4LbXzQPepyuAbVS3DRl/fXUIflfvp5qk2FJ/XQ7s242/w/FXrkWKiGIyBhMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcX8P56Kx/+sQsOgAAAAAElFTkSuQmCC\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Process data for the Rho promoter: convert counts into activity scores for each sequence\n", "library_names = [\"library1\", \"library2\"]\n", "rho_activity_data = {} # {library name: pd.DataFrame}\n", "barcode_count_dir = os.path.join(data_dir, \"Rhodopsin\")\n", "\n", "for library in library_names:\n", " print(f\"Processing data for {library} with the Rho promoter...\")\n", " # File names\n", " barcode_count_files = [\n", " os.path.join(barcode_count_dir, f\"{library}{sample}.counts\")\n", " for sample in [\"Plasmid\", \"Rna1\", \"Rna2\", \"Rna3\"]\n", " ]\n", " \n", " # Masks and metadata for downstream functions\n", " sample_labels = np.array([\"DNA\", \"RNA1\", \"RNA2\", \"RNA3\"])\n", " sample_rna_mask = np.array([False, True, True, True])\n", " rna_labels = sample_labels[sample_rna_mask]\n", 
" dna_labels = sample_labels[np.logical_not(sample_rna_mask)]\n", " n_samples = len(sample_labels)\n", " n_rna_samples = len(rna_labels)\n", " n_dna_samples = len(dna_labels)\n", " n_barcodes_per_sequence = 3\n", " \n", " # Read in the barcode counts\n", " print(\"Reading in barcode counts.\")\n", " all_sample_counts_df = quality_control.read_bc_count_files(barcode_count_files, sample_labels)\n", " display(all_sample_counts_df.head())\n", " \n", " # Remove barcodes that are detection-limited.\n", " # Barcodes below the DNA cutoff are NaN (because they are missing from the input plasmid pool)\n", " # Barcodes below any of the RNA cutoffs are zero in all replicates\n", " print(\"Removing detection-limited barcodes and normalizing to counts per million.\")\n", " cutoffs = [10, 5, 5, 5]\n", " threshold_sample_counts_df = quality_control.filter_low_counts(all_sample_counts_df, sample_labels, cutoffs,\n", " dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence)\n", " display(threshold_sample_counts_df.head())\n", "\n", " # Normalize RNA barcode counts by plasmid barcode counts\n", " print(\"Normalizing RNA to DNA.\")\n", " normalized_sample_counts_df = quality_control.normalize_rna_by_dna(threshold_sample_counts_df, rna_labels, dna_labels)\n", " # Drop DNA\n", " barcode_sample_counts_df = normalized_sample_counts_df.drop(columns=dna_labels)\n", " \n", " # Average across barcodes\n", " print(\"Averaging across barcodes within a replicate.\")\n", " activity_replicate_df = quality_control.average_barcodes(barcode_sample_counts_df)\n", " display(activity_replicate_df.head())\n", " \n", " # Basal-normalize, average across replicates, do statistics\n", " print(\"Normalizing to the basal Rho promoter.\")\n", " sequence_expression_df = quality_control.basal_normalize(activity_replicate_df, \"BASAL\")\n", " print(\"Computing p-values for the null hypothesis that a sequence is no different than the basal promoter alone.\")\n", " sequence_expression_df[\"expression_pvalue\"] = 
quality_control.log_ttest_vs_basal(activity_replicate_df, \"BASAL\")\n", " sequence_expression_df[\"expression_qvalue\"] = modeling.fdr(sequence_expression_df[\"expression_pvalue\"])\n", " print(\"Done processing data!\")\n", " display(sequence_expression_df.head())\n", " \n", " rho_activity_data[library] = sequence_expression_df" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Processing data for library1 with the Polylinker...\n", "Reading in barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr16-87432635-87432799_CPPQ_scrambled</td>\n", " <td>987</td>\n", " <td>2</td>\n", " <td>3</td>\n", " <td>10</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACCGC</th>\n", " <td>chr4-119112319-119112483_CPPE_WT</td>\n", " <td>1326</td>\n", " <td>4963</td>\n", " <td>4554</td>\n", " <td>17827</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGGG</th>\n", " <td>chr7-128854234-128854398_UPCE_WT</td>\n", " <td>35</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>2</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr4-138107597-138107761_UPPE_WT</td>\n", " <td>5</td>\n", " <td>8</td>\n", " <td>6</td>\n", " <td>4</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-31298508-31298672_CPPE_WT</td>\n", " 
<td>5007</td>\n", " <td>934</td>\n", " <td>993</td>\n", " <td>575</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 RNA2 RNA3\n", "barcode \n", "AACAACAAG chr16-87432635-87432799_CPPQ_scrambled 987 2 3 10\n", "AACAACCGC chr4-119112319-119112483_CPPE_WT 1326 4963 4554 17827\n", "AACAACGGG chr7-128854234-128854398_UPCE_WT 35 0 0 2\n", "AACAACTAC chr4-138107597-138107761_UPPE_WT 5 8 6 4\n", "AACAACTGT chr5-31298508-31298672_CPPE_WT 5007 934 993 575" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Removing barcodes missing from the DNA pool and normalizing to counts per million.\n", "Removing detection-limited barcodes and normalizing to counts per million.\n", "Barcodes missing in DNA:\n", "Sample DNA: 1722 barcodes\n", "1722 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 0 barcodes\n", "Sample RNA2: 0 barcodes\n", "Sample RNA3: 0 barcodes\n", "0 barcodes are off in more than 0 RNA samples.\n", "There are a total of 92.122 million barcode counts.\n", "Now removing RNA barcodes missing from any replicate.\n", "Barcodes missing in DNA:\n", "Sample DNA: 0 barcodes\n", "0 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 5842 barcodes\n", "Sample RNA2: 11412 barcodes\n", "Sample RNA3: 9805 barcodes\n", "12991 barcodes are off in more than 0 RNA samples.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " 
<tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr16-87432635-87432799_CPPQ_scrambled</td>\n", " <td>48.214705</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACCGC</th>\n", " <td>chr4-119112319-119112483_CPPE_WT</td>\n", " <td>64.774771</td>\n", " <td>238.306557</td>\n", " <td>198.604223</td>\n", " <td>639.087016</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGGG</th>\n", " <td>chr7-128854234-128854398_UPCE_WT</td>\n", " <td>NaN</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr4-138107597-138107761_UPPE_WT</td>\n", " <td>NaN</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-31298508-31298672_CPPE_WT</td>\n", " <td>244.590708</td>\n", " <td>44.847537</td>\n", " <td>43.305664</td>\n", " <td>20.613397</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 \\\n", "barcode \n", "AACAACAAG chr16-87432635-87432799_CPPQ_scrambled 48.214705 0.000000 \n", "AACAACCGC chr4-119112319-119112483_CPPE_WT 64.774771 238.306557 \n", "AACAACGGG chr7-128854234-128854398_UPCE_WT NaN 0.000000 \n", "AACAACTAC chr4-138107597-138107761_UPPE_WT NaN 0.000000 \n", "AACAACTGT chr5-31298508-31298672_CPPE_WT 244.590708 44.847537 \n", "\n", " RNA2 RNA3 \n", "barcode \n", "AACAACAAG 0.000000 0.000000 \n", "AACAACCGC 198.604223 639.087016 \n", "AACAACGGG 0.000000 0.000000 \n", "AACAACTAC 0.000000 0.000000 \n", "AACAACTGT 43.305664 20.613397 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing RNA to DNA.\n", "Averaging across barcodes within a replicate.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style 
scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>BASAL</th>\n", " <td>0.742818</td>\n", " <td>0.983263</td>\n", " <td>1.267636</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_MUT-allCrxSites</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_WT</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_MUT-allCrxSites</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_WT</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " RNA1 RNA2 RNA3\n", "label \n", "BASAL 0.742818 0.983263 1.267636\n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 0.000000 0.000000 0.000000\n", "chr1-104768570-104768734_UPCQ_WT 0.000000 0.000000 0.000000\n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 0.000000 0.000000 0.000000\n", "chr1-106008207-106008371_CPPE_WT 0.000000 0.000000 0.000000" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Removing the 'basal' promoter (Polylinker) and averaging across replicates. 
No statistical analysis is performed here.\n", "Done processing data!\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>expression</th>\n", " <th>expression_SEM</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_MUT-allCrxSites</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ_WT</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_MUT-allCrxSites</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE_WT</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_scrambled</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " expression expression_SEM\n", "label \n", "chr1-104768570-104768734_UPCQ_MUT-allCrxSites 0.0 0.0\n", "chr1-104768570-104768734_UPCQ_WT 0.0 0.0\n", "chr1-106008207-106008371_CPPE_MUT-allCrxSites 0.0 0.0\n", "chr1-106008207-106008371_CPPE_WT 0.0 0.0\n", "chr1-106171416-106171580_CSPE_scrambled 0.0 0.0" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Processing data for library2 with the Polylinker...\n", "Reading in barcode counts.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: 
top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>\n", " <td>3</td>\n", " <td>20</td>\n", " <td>15</td>\n", " <td>21</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGTT</th>\n", " <td>chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>\n", " <td>990</td>\n", " <td>10</td>\n", " <td>9</td>\n", " <td>10</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTAC</th>\n", " <td>chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>\n", " <td>1056</td>\n", " <td>2</td>\n", " <td>4</td>\n", " <td>3</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTCG</th>\n", " <td>chr12-116230818-116230982_CPPE_WT</td>\n", " <td>7</td>\n", " <td>4</td>\n", " <td>6</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>\n", " <td>1653</td>\n", " <td>1441</td>\n", " <td>9</td>\n", " <td>4695</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 RNA2 \\\n", "barcode \n", "AACAACAAG chr7-141291911-141292075_UPPP_MUT-allCrxSites 3 20 15 \n", "AACAACGTT chr19-16380352-16380516_CPPN_MUT-allCrxSites 990 10 9 \n", "AACAACTAC chr1-44147572-44147736_UPPP_MUT-allCrxSites 1056 2 4 \n", "AACAACTCG chr12-116230818-116230982_CPPE_WT 7 4 6 \n", "AACAACTGT chr5-65391346-65391510_CPPP_MUT-allCrxSites 1653 1441 9 \n", "\n", " RNA3 \n", "barcode \n", "AACAACAAG 21 \n", "AACAACGTT 10 \n", "AACAACTAC 3 \n", "AACAACTCG 0 \n", "AACAACTGT 4695 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": 
"stdout", "output_type": "stream", "text": [ "Removing barcodes missing from the DNA pool and normalizing to counts per million.\n", "Removing detection-limited barcodes and normalizing to counts per million.\n", "Barcodes missing in DNA:\n", "Sample DNA: 2107 barcodes\n", "2107 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 0 barcodes\n", "Sample RNA2: 0 barcodes\n", "Sample RNA3: 0 barcodes\n", "0 barcodes are off in more than 0 RNA samples.\n", "There are a total of 89.662 million barcode counts.\n", "Now removing RNA barcodes missing from any replicate.\n", "Barcodes missing in DNA:\n", "Sample DNA: 0 barcodes\n", "0 barcodes are missing from more than 0 DNA samples.\n", "Barcodes off in RNA:\n", "Sample RNA1: 12647 barcodes\n", "Sample RNA2: 12055 barcodes\n", "Sample RNA3: 10999 barcodes\n", "13873 barcodes are off in more than 0 RNA samples.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>label</th>\n", " <th>DNA</th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>barcode</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>AACAACAAG</th>\n", " <td>chr7-141291911-141292075_UPPP_MUT-allCrxSites</td>\n", " <td>NaN</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACGTT</th>\n", " <td>chr19-16380352-16380516_CPPN_MUT-allCrxSites</td>\n", " <td>38.377926</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " 
<th>AACAACTAC</th>\n", " <td>chr1-44147572-44147736_UPPP_MUT-allCrxSites</td>\n", " <td>40.936454</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTCG</th>\n", " <td>chr12-116230818-116230982_CPPE_WT</td>\n", " <td>NaN</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>AACAACTGT</th>\n", " <td>chr5-65391346-65391510_CPPP_MUT-allCrxSites</td>\n", " <td>64.079506</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " label DNA RNA1 \\\n", "barcode \n", "AACAACAAG chr7-141291911-141292075_UPPP_MUT-allCrxSites NaN 0.0 \n", "AACAACGTT chr19-16380352-16380516_CPPN_MUT-allCrxSites 38.377926 0.0 \n", "AACAACTAC chr1-44147572-44147736_UPPP_MUT-allCrxSites 40.936454 0.0 \n", "AACAACTCG chr12-116230818-116230982_CPPE_WT NaN 0.0 \n", "AACAACTGT chr5-65391346-65391510_CPPP_MUT-allCrxSites 64.079506 0.0 \n", "\n", " RNA2 RNA3 \n", "barcode \n", "AACAACAAG 0.0 0.0 \n", "AACAACGTT 0.0 0.0 \n", "AACAACTAC 0.0 0.0 \n", "AACAACTCG 0.0 0.0 \n", "AACAACTGT 0.0 0.0 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Normalizing RNA to DNA.\n", "Averaging across barcodes within a replicate.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>RNA1</th>\n", " <th>RNA2</th>\n", " <th>RNA3</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>BASAL</th>\n", " <td>0.000000</td>\n", " 
<td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_MUT-allCrxSites</th>\n", " <td>1.486824</td>\n", " <td>0.405204</td>\n", " <td>1.305344</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_WT</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_MUT-shape</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_WT</th>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " RNA1 RNA2 RNA3\n", "label \n", "BASAL 0.000000 0.000000 0.000000\n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 1.486824 0.405204 1.305344\n", "chr1-10229074-10229238_CPPE_WT 0.000000 0.000000 0.000000\n", "chr1-106171416-106171580_CSPE_MUT-shape 0.000000 0.000000 0.000000\n", "chr1-106171416-106171580_CSPE_WT 0.000000 0.000000 0.000000" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Removing the 'basal' promoter (Polylinker) and averaging across replicates. 
No statistical analysis is performed here.\n", "Done processing data!\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>expression</th>\n", " <th>expression_SEM</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_MUT-allCrxSites</th>\n", " <td>1.06579</td>\n", " <td>0.334422</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-10229074-10229238_CPPE_WT</th>\n", " <td>0.00000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_MUT-shape</th>\n", " <td>0.00000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_WT</th>\n", " <td>0.00000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106171416-106171580_CSPE_scrambled</th>\n", " <td>0.00000</td>\n", " <td>0.000000</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " expression expression_SEM\n", "label \n", "chr1-10229074-10229238_CPPE_MUT-allCrxSites 1.06579 0.334422\n", "chr1-10229074-10229238_CPPE_WT 0.00000 0.000000\n", "chr1-106171416-106171580_CSPE_MUT-shape 0.00000 0.000000\n", "chr1-106171416-106171580_CSPE_WT 0.00000 0.000000\n", "chr1-106171416-106171580_CSPE_scrambled 0.00000 0.000000" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAARgAAAD/CAYAAAAquMkCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEvxJREFUeJzt3XuQXnV9x/H3hwQJ5AKGrFCBZAsKoaBBs4hIuYmDcola0rEgt6naIDadUi6FOoCpchsFpIwaWaWQSkq5KwLCKEblJs7S4ZYaoJSEIAQ2XJZcIBD49o9zVk8eNrtns8/vPJd8XjPPZJ/fOc9zvr9k55Nz+52fIgIzsxQ2aXQBZta+HDBmlowDxsySccCYWTIOGDNLxgFjZsk4YMwsGQeMmSXjgDGzZEY3uoANNWnSpOjs7Gx0GWYbpQceeGB5RHQMtV7LBkxnZyc9PT2NLsNsoyRpSZn1fIhkZsk4YMwsGQeMmSXjgDGzZBwwZpaMA8bMknHAmFkylQWMpKskPSfpVUmPS/pSYdlBkhZJWi1pgaQpVdVlZulUuQdzPtAZEROATwPnSJouaRJwI3AWMBHoAa6psC4zS6SyO3kjYmHxbf7aCZgOLIyI6wAkzQGWS5oaEYuqqs8slc4zbm10CaUsvuCwun9npedgJH1P0mpgEfAccBuwG/BQ/zoRsQp4Mm83sxZWacBExFeA8cC+ZIdFa4BxQF/Nqn35euuQNEtSj6Se3t7e1OWa2QhVfhUpIt6KiLuB7YETgZXAhJrVJgArBvhsd0R0RURXR8eQAznNrMEaeZl6NNk5mIXAtP5GSWML7WbWwioJGEnvkXSkpHGSRkn6JHAUcCdwE7C7pJmSxgBnAw/7BK9Z66tqDybIDoeeAV4GLgROioibI6IXmAmcmy/bCziyorrMLKFKLlPnIbL/IMt/AUytohYzq46HCphZMg4YM0vGAWNmyThgzCwZB4yZJeOAMbNkHDBmlowDxsySccCYWTIOGDNLxgFjZsk4YMwsGQeMmSXjgDGzZBwwZpaMA8bMknHAmFkyDhgzS8YBY2bJOGDMLBkHjJkl44Axs2QcMGaWjAPGzJKpaurYzSRdLmmJpBWSHpR0SL6sU1JIWll4nVVFXWaWViUzO+bbWUo2u+PTwKHAtZI+UFhnq4hYW1E9ZlaBSvZgImJVRMyJiMUR8XZE3AI8BUyvYvtm1hgNOQcjaRtgZ2BhoXmJpGckXSFpUiPqMrP6qjxgJG0KzAfmRcQiYDmwJzCFbI9mfL58oM/OktQjqae3t7eqks1sA1V1DgYASZsAPwLeAGYDRMRKoCdf5XlJs4HnJI2PiBXFz0dEN9AN0NXVFZUVbpXoPOPWRpdQ2uILDmt0CS2hsoCRJOByYBvg0Ih4cz2r9geHL6Gbtbgq92DmArsCn4iI1/obJe0FvAI8AbwbuBT4VUT0VVibmSVQ1X0wU4ATgD2AZYX7XY4GdgRuB1YAjwJrgKOqqMvM0qpkDyYilgAaZJWrq6jDzKrl8xxmlowDxsySccCYWTIOGDNLxgFjZsk4YMwsGQeMmSXjgDGzZBwwZpZMqYCRdLKkPfKfPyrpaUlPSdo7bXlm1srKDhX4J7KR0ADnAxeTjR26BNgrQV02BD/awFpB2YDZMiL6JI0HppGNiH5L0kUJazOzFlc2YJZK+hiwG/CbPFwmAG+lK83MWl3ZgDkNuJ7sSXQz87bDgd+lKKqefChh1jilAiYibgPeW9N8HXBt3Ssys7ZR9irSS7Vt+SMvn617RWbWNsreB7NpbUM+O8Co+pZjZu1k0EMkSXeRPYR7jKTf1CzeHrg3VWFm1vqGOgfzQ7JHXe7Jn+6DgSx0ngd+maguM2sDgwZMRMwDkPTbfJI0M7PSyl5FWiTpYLJZAcbVLDs7RWFm1vpKBYyk7wCfAxYAqwuLPLuima1X2RvtPg9Mi4ilKYsxs/ZS9jL1crLZF83MSisbMBcB8yX
tLWnH4qvMhyVtJulySUskrZD0oKRDCssPkrRI0mpJC/KZIM2sxZU9RJqb/3l4TXtQ7ma70cBSYH/gaeBQ4FpJHwBWAjcCXwJ+CnwDuAb4aMnazKxJlb2KNKIn30XEKmBOoekWSU8B04GtgYURcR2ApDnAcklTfWncrLU15JGZkrYBdgYWkj0C4qH+ZXkYPZm3m1kLK3uZun/IwDtExH7D2WA+hmk+MC+/v2Yc0FuzWh8wfoDPzgJmAUyePHk4mzWzBih7DuaHNe+3Bb4IXDWcjUnaBPgR2XNlZufNK4EJNatOIHsk5zoiohvoBujq6vI9OGZNruw5mHm1bZJuAK4Avl7mOySJbDzTNsCh+eMeIDtMOr6w3lhgp7zdzFrYSM7B/AH44DDWnwvsCsyIiNcK7TcBu0uaKWkMcDbwsE/wmrW+sudgvlDTtAVwBPDbkp+fApwArAGWZTszAJwQEfMlzQS+Q3bIdT9wZJnvNbPmVvYczLE171eRPQvm22U+HBFLyB77sL7lvwCmlqzFzFpE2XMwB6YuxMzaT9k9GCS9HzgK2I7s/MvVEfFEqsLMrPWVfej3DOABssOYl4BdgB5Jn05Ym5m1uLJ7MOcBn4mIBf0Nkg4gOzF7c4K6zKwNlL1MvT1wV03b3Xm7mdmAygbMg8ApNW0n5+1mZgMqe4h0IvBTSf9I9tiFHcgenTkjVWFm1vqG89DvXcme0fJeshkd7y/c7m9m9g5l7+TdA3gxIu4utO0gaWJEPDTIR81sI1b2HMxVvHP62HeRjYw2MxtQ2YCZHBH/V2yIiCeBzrpXZGZto2zAPCPpw8WG/P2z9S/JzNpF2atI3wZ+IumbZI+z3Ak4FTg3VWFm1vrKXkX6gaRXyJ5itwPZpepTIuL6lMWZWWsrPdgxf+r/dQlrMbM205BZBcxs4+CAMbNkHDBmlowDxsySWe9JXkmlpiOJiLPrV46ZtZPBriLtUFkVZtaW1hswEfG3VRZiZu2n9H0wAJLGA5MoTEFSO0bJzKxf2cc1/AXZhPXTgCALmP65oUelKc3MWl3Zq0jfAxYAE4FXgXcDl1GYU9rMrFbZgJkGnB4RrwCKiD7gNOAbZTckabakHklrJF1ZaO+UFJJWFl5nDacTZtacyp6DeZ3sgVNvAsslTQZeBrYexraeBc4BPglsPsDyrSJi7TC+z8yaXNk9mLuAz+U/Xw/8DPg18MuyG4qIGyPix8CLw6rQzFpW2cc1fK7w9qvAo8B4YF4da1kiKYCfA6dFxPLaFSTNAmYBTJ48uY6bNrMUyk4de2r/zxHxdkRcFRFzgS/XoYblwJ7AFGA6WXDNH2jFiOiOiK6I6Oro6KjDps0spbKHSOsbDnDmSAuIiJUR0RMRayPieWA2cHB+z42ZtbBBD5EkfTz/cZSkAyncYAfsCKxIUFP//TUeiGnW4oY6B3N5/ucY4N8L7QEsA/6h7IYkjc63N4ossMYAa8kOi14BniC7v+ZS4Ff5pXAza2GDBkxE/DmApP+IiONGuK0zga8V3h8D/CvwGHAe8B6ym/h+Dhw1wm2ZWRMoexXpuHwP5GPAdsAzwH3DuW8lIuYAc9az+Oqy32NmraPsWKRdgFvIbpBbSvYoh9clzYiI3yesz8xaWNkTqXOBbmCHiNg7IrYHvk82RsnMbEBlA2YP4OKIiELbJXm7mdmAygbMs8D+NW374qljzWwQZQc7fhW4WdItwBKyu24PI7sSZGY2oFJ7MBFxM/Bh/jQG6VFgekT8JGFtZtbiyl5FOjUiLiR73EKx/eSIuDhJZWbW8ho+FsnM2lczjkUyszZR2VgkM9v4VDkWycw2MmWvIjlczGzY/MwVM0vGAWNmyThgzCwZB4yZJTOigJH0SL0KMbP2M9I9mPPrUoWZtaWy8yJtu55FpWd2NLONT9k9mMfX0/4/9SrEzNpP2YDROxqkCcDb9S3HzNrJUIMdl5KNO9pc0tM1i7fGswGY2SCGGux4DNney23AsYX2AJ6
PiMdSFWZmrW+owY6/BpA0KSJWV1OSmbWLsudgrpK0b7FB0r6Sri+7IUmzJfVIWiPpypplB0laJGm1pAWSppT9XjNrXmUDZn/g3pq2+4ADh7GtZ8keuVl8rgySJgE3AmcBE4Ee4JphfK+ZNamyAfM6MLambRzwZtkNRcSNEfFj4MWaRUcACyPiuoh4nWx62WmSppb9bjNrTmUD5g7gsvzSdP8l6u8At9ehht2Ah/rfRMQq4Mm8fR2SZuWHWT29vb112LSZpVQ2YE4BJgAvSXoBeAnYEjipDjWMA/pq2vrIpkdZR0R0R0RXRHR1dHTUYdNmllKpaUsi4mXgsHzIwA7A0ohYVqcaVpKFV9EE/EBxs5Y31I12W5BNTbI78N/A+XUMln4LgeML2xwL7JS3m1kLG+oQ6bvADGAR8NfAhRu6IUmjJY0BRpFNgzJG0mjgJmB3STPz5WcDD0fEog3dlpk1h6EC5lPAwRHxz8AhwOEj2NaZwGvAGWR3CL8GnBkRvcBM4FzgZWAv4MgRbMfMmsRQ52DGRsRzABGxVNKWG7qhiJhDdgl6oGW/AHxZ2qzNDBUwo2tmdKx9T0T4mTBmNqChAuYF1r3z9kXeOcPjjvUuyszaw1CDHTsrqsPM2pBnFTCzZBwwZpaMA8bMknHAmFkyDhgzS8YBY2bJOGDMLBkHjJkl44Axs2QcMGaWjAPGzJJxwJhZMg4YM0vGAWNmyThgzCwZB4yZJeOAMbNkHDBmlowDxsySccCYWTIOGDNLpmkCRtKvJL0uaWX+eqzRNZnZyDRNwORmR8S4/LVLo4sxs5FptoAxszbSbAFzvqTlku6RdECjizGzkWmmgDmdbBra7YBu4KeSdiquIGmWpB5JPb29vY2o0cyGoWkCJiLuj4gVEbEmIuYB9wCH1qzTHRFdEdHV0dHRmELNrLSmCZgBBKBGF2FmG64pAkbSVpI+KWmMpNGSjgb2A25vdG1mtuFGN7qA3KbAOcBU4C1gEfDZiHi8oVWZ2Yg0RcBERC+wZ6PrMLP6aopDJDNrTw4YM0vGAWNmyThgzCwZB4yZJeOAMbNkHDBmlowDxsySccCYWTIOGDNLxgFjZsk4YMwsGQeMmSXjgDGzZBwwZpaMA8bMknHAmFkyDhgzS8YBY2bJOGDMLBkHjJkl44Axs2QcMGaWjAPGzJJpmoCRNFHSTZJWSVoi6fONrsnMRqYpZnbMfRd4A9gG2AO4VdJDEbGwsWWZ2YZqij0YSWOBmcBZEbEyIu4GbgaObWxlZjYSTREwwM7A2prJ7h8CdmtQPWZWB4qIRteApH2B6yJi20Lb3wFHR8QBhbZZwKz87S7AY1XWWWMSsLyB20+hHfsE7dmvRvdpSkR0DLVSs5yDWQlMqGmbAKwoNkREN9BdVVGDkdQTEV2NrqOe2rFP0J79apU+Ncsh0uPAaEnvL7RNA3yC16yFNUXARMQq4Ebg65LGStoH+Azwo8ZWZmYj0RQBk/sKsDnwAnA1cGKTX6JuikO1OmvHPkF79qsl+tQUJ3nNrD010x6MmbUZB4yZJeOAyUlaLOk1SSskvSLpXklflrRJvvxKSSHpI4XPvE/SO44x83XXSvqzKvuQb7u/HyslLctrGVeoa0R9kLS7pDskLR/ocylU0KfjJT0g6VVJz0j6pqTkt3BU0K8jJT0mqU/SC5LmSaq9HSQpB8y6ZkTEeGAKcAFwOnB5YflLwDmDfUFh2EMfcEyiOocyIyLGkY3p+hDwL4VlI+3Dm8C1wBfrVm05Kfu0BXAS2c1rewEHAafWp+whpezXPcA+EbElsCPZfW+Dfl+9OWAGEBF9EXEz8DfA8ZJ2zxfNAz4oaf9BPj4TeAX4OnB82koHFxHLgDvIfnn7jagPEfFYRFxOg+5RStSnuRFxV0S8ERF/AOYD+9S38sEl6tfSiCje7fsW8L76VFyOA2YQEfE74Blg37xpNXAecO4gHzue7DL7fwFTJU1PWuQgJG0PHAL8b6G5pfpQq6I
+7UfFAZqqX5L+UlIf2V3xM4FL6ln3UBwwQ3sWmFh4fxkwWdIhtStKmgwcCPxnRDwP3AkcV0mV6/qxpBXAUrL7ir5Ws7wV+lCrkj5J+gLQBVxYx9oHk7RfEXF3foi0PfAtYHHdezAIB8zQtiM7FgYgItYA38hftY4Ffh8RD+bv5wOfl7Rp8irX9dn8XNIBwFSycwt/1CJ9qJW8T5I+C5wPHFJzaJFSJf9W+aHf7WR7OpVxwAxC0p5kAXN3zaIrgK2AI2rajwN2zK8ILAMuJvuFOTR1rQOJiF8DVzLw/8Yt0Ydaqfok6VPAD8hOuj6SoPRBVfRvNRrYqS4Fl9Qso6mbSn4pbz/g34CrIuIRSX9cHhFrJX0NuLTwmb3J/vE+BPQWvu4isl+En1RQ+kAuARZLmlZs3NA+KPuL2Ax4V/6ZMdnXxZqkvVhXvfv0cbL//f8qP+/WKPXu19HAXRHxtKQpZOdy7kzch3VFhF/ZcInFwGtkJ8P6gPuAvwdG5cuvBM4prL8J8Gj2VxgA3wduGOB7PwKsASZW2I9P1LTNBW6oRx+ATiBqXotbvE8LgLVkjw3pf/2sDf6tziW7SLEq/7Mb2LqK38P+l8cimVkyPgdjZsk4YMwsGQeMmSXjgDGzZBwwZpaMA8bMknHAmFkyDhgzS8YBY2bJ/D+kzR5R1OXl7QAAAABJRU5ErkJggg==\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAARgAAAD/CAYAAAAquMkCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEwpJREFUeJzt3XuUXWV9xvHvA0GDhIhAipVLUlCJBQ3KIKLlJhYFpFrTZVHkUrVRLF21ihVdiFkCYi0idalolAoVSgUEBVRcVUHB6xqqKKkBpQSCCEwEYhIgAj79Y+/Yk8NkZg9z3nPj+ax11sx59z7n/b0kPNn73TfZJiKihE16XUBEDK8ETEQUk4CJiGISMBFRTAImIopJwEREMQmYiCimawEj6XxJv5b0W0k3S3pzy7KDJC2T9ICkqyXN7VZdEVGOunWinaTdgF/aXidpPnANcBhwG3AL8GbgCuAUYF/bL+pKYRFRzIxudWR7aevb+rULsCew1PbFAJIWAyslzbe9bGPft+2223revHnlCo6Ijbr++utX2p4z2XpdCxgASZ8EjgU2B34MfBU4Dbhh/Tq210q6BdgN2GjAzJs3j9HR0aL1RsT4JN3WZL2uTvLafhuwJbAvcCmwDpgFrGpbdVW93gYkLZI0Kml0bGysdLkRMU1dP4pk+1Hb1wE7AMcBa4DZbavNBlaP89kltkdsj8yZM+nWWUT0WC8PU8+gmoNZCixY3yhpi5b2iBhgXQkYSX8k6QhJsyRtKunlwOuAbwKXAbtLWihpJnAy8NOJJngjYjB0awvGVLtDdwD3AWcAb7d9ue0xYCHVZO99wN7AEV2qKyIK6spRpDpE9p9g+TeA+d2oJSK6J5cKREQxCZiIKCYBExHFdPVM3l6Yd+JXel1CY8s/dFivS4joqGzBREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIooZ+hPtYnDkpMjhky2YiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYroSMJKeLOkcSbdJWi3pJ5IOqZfNk2RJa1pe7+tGXRFRVrdu1zADWEH1fOrbgUOBiyQ9t2WdrWw/0qV6IqILurIFY3ut7cW2l9v+ve0rgVuBPbvRf0T0Rk/mYCRtBzwbWNrSfJukOyR9TtK2vagrIjqr6wEjaTPgAuA828uAlcBewFyqLZot6+XjfXaRpFFJo2NjY90qOSIep64GjKRNgM8DvwOOB7C9xvao7Uds3123Hyxpy/bP215ie8T2yJw5c7pZekQ8Dl27J68kAecA2wGH2n54I6u6/plD6BEDrps3/T4beA7wMtsPrm+UtDdwP/AL4GnAx4BrbK/qYm0RUUC3zoOZC7wF2AO4q+V8lyOBnYGrgNXAjcA64HXdqCsiyurKFozt2wBNsMqF3agjIror8xwRUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBTTzTN5o4PmnfiVXpfQ2PIPHdbrEqJHsgUTEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiophGASPpHZL2qH9/kaTbJd0qaZ+y5UXEIGu6BfOPVI96BTgdOBM4FTirRFERMRyaXuz4VNur6oehLaB69Mijkj5SsLaIGHBNA2aFpBcDuwHfqcNlNvBoudIiYtA1DZh3AZdQPfJ1Yd32SuBHJYqKiOHQKGBsfxV4RlvzxcBFHa8oIoZG06NI97a31c+WvrPjFUXE0Gh6FGmz9gZJmwGbdraciBgmE+4iSboWMDBT0nfaFu8AfK9JJ5KeDHwSeBmwNXAL8B7bX6uXHwR8AtgJ+CFwbP242YiBNyi3Ny1xa9PJ5mA+S/VM6b2Ac1raDdwNfGsK/awA9gduBw4FLpL0XGANcCnwZuAK4BTgC8CLGn53RPSpCQPG9nkAkn5ge9nj7cT
2WmBxS9OVkm4F9gS2AZbavrjuazGwUtL86fQZEb3X9CjSMkkHA3sAs9qWnTzVTiVtBzwbWAocB9zQ8n1rJd1Cdc5NAiZigDUKGEkfB14LXA080LLIU+2wnhy+ADivDq5ZwFjbaquALcf57CJgEcBOO+001a4josuanmj3emCB7RXT6UzSJsDnqU7YO75uXgPMblt1NrC6/fO2lwBLAEZGRqYcbhHRXU0PU68E7p9OR5JENVG8HbCwPo8Gqt2kBS3rbQHsUrdHxABrGjAfAS6QtI+knVtfU+jrbOA5wOG2H2xpvwzYXdJCSTOBk4GfZoI3YvA13UU6u/75yrZ20+BkO0lzgbcA64C7qo0ZAN5i+wJJC4GPA+dTnQdzRMO6IqKPNT2KNK0739UnzWmC5d8A5k+nj4joP7llZkQU0/Qw9fpLBh7D9n4drSgihkbTOZjPtr1/OvAmqjmTiIhxNZ2DOa+9TdIXgc8BH+h0URExHKYzB/Mr4HmdKiQihk/TOZg3tjU9BXgN8IOOVxQRQ6PpHMxRbe/XUt0L5qOdLScihknTOZgDSxcSEcOn6RYMkp4FvA7Ynmr+5ULbvyhVWEQMvqY3/T4cuJ7qbNt7gV2BUUl/UbC2iBhwTbdgPgi8yvbV6xskHUB1/dDlBeqKiCHQ9DD1DsC1bW3X1e0REeNqGjA/Ad7Z1vaOuj0iYlxNd5GOA66Q9A9UTwfYkerWmYeXKiwiBt9Ubvr9HKpHiTyD6omOP2y5K11ExGM0PZN3D+A3tq9radtR0ta2b5jgoxHxBNZ0DuZ8Hvv42CdR3cA7ImJcTQNmJ9v/29pg+xZgXscrioih0TRg7pD0gtaG+v2dnS8pIoZF06NIHwW+LOnDVA+u3wU4ATitVGERMfiaHkX6jKT7qe5ityPVoep32r6kZHERMdgaX+xYP5z+4oK1RMSQyVMFIqKYBExEFJOAiYhiEjARUcxGJ3klNXocie2Tm6wn6XjgWOC5VHfDO7ZunwfcSnWf3/X+2fYpTb43IvrXREeRduxwX3cCpwIvBzYfZ/lWth/pcJ8R0UMbDRjbf9PJjmxfCiBphNyoKuIJofF5MACStgS2BbS+rf0apWm4TZKB/wLeZXtlh743Inqk6U2//1TSj4FVwC/r1y/q13StBPYC5gJ7AlsCF2ykjkWSRiWNjo2NdaDriCip6VGkTwJXA1sDvwWeBnwaOGa6BdheY3vU9iO27waOBw6ut5ba111ie8T2yJw5c6bbdUQU1nQXaQHw57YfliTbqyS9C7iR6l4xneT6Zw6hRwy4pv8TP8T/33BqpaSd6s9u07QjSTMkzQQ2BTaVNLNu21vSrpI2kbQN8DHgGturpjCOiOhDTQPmWuC19e+XAF8Dvg18awp9nQQ8CJwIvKH+/SRgZ+AqYDXVFtE6qidIRsSAa3q7hte2vH0vVRBsCZzXtCPbi4HFG1l8YdPviYjB0fQo0gnrf7f9e9vn2z4beGuxyiJi4DXdRdrY5QAndaqQiBg+E+4iSXpp/eumkg6k5QQ7qrmT1aUKi4jBN9kczDn1z5nAv7W0G7gL+PsSRUXEcJgwYGz/CYCkf7d9dHdKiohh0fQo0tGSZgAvBrYH7gC+n6ufI2IiTR8duytwJdVtFlZQ3crhIUmH2/55wfoiYoA1PYp0NrAE2NH2PrZ3AD5FdY1SRMS4mgbMHsCZtt3SdlbdHhExrqYBcyewf1vbvuTRsRExgaZXU78XuFzSlcBtVPduOYzqmqKIiHE12oKxfTnwAv7/GqQbgT1tf7lgbREx4JoeRTrB9hlUN+1ubX+H7TOLVBYRAy/XIkVEMbkWKSKKybVIEVFMrkWKiGKaHkVKuETElOXO/RFRTAImIopJwEREMQmYiChmWgEj6WedKiQihs90t2BO70gVETGUmj4X6ekbWTSVJztGxBNM0y2YmzfS/j+dKiQihk/TgNFjGqTZwO+bdiTpeEmjktZJOrdt2UGSlkl6QNLVkuY2/d6
I6F8TBoykFZJuBzaXdHvrC/g18KUp9HUn1e0eWq9pQtK2wKXA+4CtgVHgC1P43ojoU5Nd7PgGqq2XrwJHtbQbuNv2TU07sn0pgKQRYIeWRa8Bltq+uF6+GFgpab7tZU2/PyL6z2QXO34bqq0M2w8UqmE34IaWPtdKuqVuT8BEDLCmczDnS9q3tUHSvpIu6UANs4BVbW2rqG7NuQFJi+p5nNGxsbEOdB0RJTUNmP2B77W1fR84sAM1rAFmt7XNZpybWdleYnvE9sicOXM60HVElNQ0YB4CtmhrmwU83IEalgIL1r+RtAWwS90eEQOsacB8Hfh0fWh6/SHqjwNXNe1I0gxJM4FNqW7BObN+3vVlwO6SFtbLTwZ+mgneiMHXNGDeSbXbcq+ke4B7gacCb59CXycBDwInUh2dehA4yfYYsBA4DbgP2Bs4YgrfGxF9qtFjS2zfBxxWXzKwI7DC9l1T6cj2YmDxRpZ9A5g/le+LiP432VMFnkK15bE78N/A6VMNloh44ppsF+kTwOFU56P8FXBG8YoiYmhMFjCvAA62/U/AIcAry5cUEcNisoDZwvavAWyvoJrYjYhoZLJJ3hltT3Rsf4/t3BMmIsY1WcDcw4ZXP/+Gxz7hcedOFxURw2Gyix3ndamOiBhCeapARBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGISMBFRTAImIopJwEREMQmYiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUzfBIykayQ9JGlN/bqp1zVFxPT0TcDUjrc9q37t2utiImJ6+i1gImKI9FvAnC5ppaTvSjqg18VExPT0U8C8m+ohbtsDS4ArJO3SuoKkRZJGJY2OjY31osaImIK+CRjbP7S92vY62+cB3wUObVtnie0R2yNz5szpTaER0VjfBMw4TMszsCNi8PRFwEjaStLLJc2UNEPSkcB+wFW9ri0iHr8Jn03dRZsBpwLzgUeBZcCrbd/c06oiYlr6ImBsjwF79bqOiOisvthFiojhlICJiGISMBFRTAImIopJwEREMQmYiCgmARMRxSRgIqKYBExEFJOAiYhiEjARUUwCJiKKScBERDEJmIgoJgETEcUkYCKimARMRBSTgImIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGL6JmAkbS3pMklrJd0m6fW9rikipqcvnk1d+wTwO2A7YA/gK5JusL20t2VFxOPVF1swkrYAFgLvs73G9nXA5cBRva0sIqajLwIGeDbwiO2bW9puAHbrUT0R0QGy3esakLQvcLHtp7e0/S1wpO0DWtoWAYvqt7sCN3WzzjbbAit72H8JwzgmGM5x9XpMc23PmWylfpmDWQPMbmubDaxubbC9BFjSraImImnU9kiv6+ikYRwTDOe4BmVM/bKLdDMwQ9KzWtoWAJngjRhgfREwttcClwIfkLSFpJcArwI+39vKImI6+iJgam8DNgfuAS4EjuvzQ9R9savWYcM4JhjOcQ3EmPpikjcihlM/bcFExJBJwEREMQmYmqTlkh6UtFrS/ZK+J+mtkjapl58ryZJe2PKZZ0p6zD5mve4jkv64m2Oo+14/jjWS7qprmdVS17TGIGl3SV+XtHK8z5XQhTEdI+l6Sb+VdIekD0sqfgpHF8Z1hKSbJK2SdI+k8yS1nw5SVAJmQ4fb3hKYC3wIeDdwTsvye4FTJ/qClsseVgFvKFTnZA63PYvqmq7nA+9pWTbdMTwMXAS8qWPVNlNyTE8B3k518trewEHACZ0pe1Ilx/Vd4CW2nwrsTHXe24Tf12kJmHHYXmX7cuCvgWMk7V4vOg94nqT9J/j4QuB+4APAMWUrnZjtu4CvU/3lXW9aY7B9k+1z6NE5SoXGdLbta23/zvavgAuAl3S28okVGtcK261n+z4KPLMzFTeTgJmA7R8BdwD71k0PAB8ETpvgY8dQHWb/T2C+pD2LFjkBSTs
AhwC/bGkeqDG069KY9qPLAVpqXJL+TNIqqrPiFwJndbLuySRgJncnsHXL+08DO0k6pH1FSTsBBwL/Yftu4JvA0V2pckNfkrQaWEF1XtH725YPwhjadWVMkt4IjABndLD2iRQdl+3r6l2kHYB/AZZ3fAQTSMBMbnuqfWEAbK8DTqlf7Y4Cfm77J/X7C4DXS9qseJUbenU9l3QAMJ9qbuEPBmQM7YqPSdKrgdOBQ9p2LUrqyp9Vvet3FdWWTtckYCYgaS+qgLmubdHngK2A17S1Hw3sXB8RuAs4k+ovzKGlax2P7W8D5zL+v8YDMYZ2pcYk6RXAZ6gmXX9WoPQJdenPagawS0cKbqhfrqbuK/WhvP2AfwXOt/0zSX9YbvsRSe8HPtbymX2o/vCeD4y1fN1HqP4ifLkLpY/nLGC5pAWtjY93DKr+QzwZeFL9mZnV13ld0VFsqNNjeinVv/5/Wc+79Uqnx3UkcK3t2yXNpZrL+WbhMWzIdl7V5RLLgQepJsNWAd8H/g7YtF5+LnBqy/qbADdW/wkN8Cngi+N87wuBdcDWXRzHy9razga+2IkxAPMAt72WD/iYrgYeobptyPrX14bgz+o0qoMUa+ufS4BtuvH3cP0r1yJFRDGZg4mIYhIwEVFMAiYiiknAREQxCZiIKCYBExHFJGAiopgETEQUk4CJiGL+D1A3OrgYfvUZAAAAAElFTkSuQmCC\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Now process data for the Polylinker (experiment is in Fig 4, but it is easier to process the data here)\n", "# Process data for the Rho promoter: convert counts into activity scores for each sequence\n", "library_names = [\"library1\", \"library2\"]\n", "polylinker_activity_data = {} # {library name: pd.DataFrame}\n", "barcode_count_dir = os.path.join(data_dir, \"Polylinker\")\n", "\n", "for library in library_names:\n", " print(f\"Processing data for {library} with the Polylinker...\")\n", " # File names\n", " barcode_count_files = [\n", " os.path.join(barcode_count_dir, f\"{library}{sample}.counts\")\n", " for sample in [\"Plasmid\", \"Rna1\", \"Rna2\", \"Rna3\"]\n", " ]\n", " \n", " # Masks and metadata for downstream functions\n", " sample_labels = np.array([\"DNA\", \"RNA1\", \"RNA2\", \"RNA3\"])\n", " sample_rna_mask = np.array([False, True, True, True])\n", " rna_labels = sample_labels[sample_rna_mask]\n", " dna_labels = sample_labels[np.logical_not(sample_rna_mask)]\n", " n_samples = len(sample_labels)\n", " n_rna_samples = len(rna_labels)\n", " n_dna_samples = len(dna_labels)\n", " 
n_barcodes_per_sequence = 3\n", " \n", " # Read in the barcode counts\n", " print(\"Reading in barcode counts.\")\n", " all_sample_counts_df = quality_control.read_bc_count_files(barcode_count_files, sample_labels)\n", " display(all_sample_counts_df.head())\n", " \n", " # Remove barcodes that are detection-limited.\n", " print(\"Removing barcodes missing from the DNA pool and normalizing to counts per million.\")\n", " cutoffs_dna_only = [50, 0, 0, 0]\n", " # Barcodes below the DNA cutoff are NaN (because they are missing from the input plasmid pool)\n", " # Barcodes below any of the RNA cutoffs are zero in all replicates\n", " print(\"Removing detection-limited barcodes and normalizing to counts per million.\")\n", " threshold_sample_counts_df = quality_control.filter_low_counts(all_sample_counts_df, sample_labels, cutoffs_dna_only,\n", " dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence)\n", " print(\"Now removing RNA barcodes missing from any replicate.\")\n", " cutoffs_rna_cpm = [0, 8, 8, 8]\n", " threshold_sample_counts_df = quality_control.filter_low_counts(threshold_sample_counts_df, sample_labels, cutoffs_rna_cpm,\n", " dna_labels=dna_labels, bc_per_seq=n_barcodes_per_sequence, cpm_normalize=False)\n", " display(threshold_sample_counts_df.head())\n", "\n", " # Normalize RNA barcode counts by plasmid barcode counts\n", " print(\"Normalizing RNA to DNA.\")\n", " normalized_sample_counts_df = quality_control.normalize_rna_by_dna(threshold_sample_counts_df, rna_labels, dna_labels)\n", " # Drop DNA\n", " barcode_sample_counts_df = normalized_sample_counts_df.drop(columns=dna_labels)\n", " \n", " # Average across barcodes\n", " print(\"Averaging across barcodes within a replicate.\")\n", " activity_replicate_df = quality_control.average_barcodes(barcode_sample_counts_df)\n", " display(activity_replicate_df.head())\n", " \n", " # Drop \"basal\" and average across replicates\n", " print(\"Removing the 'basal' promoter (Polylinker) and averaging across 
replicates. No statistical analysis is performed here.\")\n", " activity_replicate_df = activity_replicate_df.drop(index=\"BASAL\")\n", " sequence_expression_df = activity_replicate_df.apply(lambda x: pd.Series({\"expression\": x.mean(), \"expression_SEM\": x.sem()}), axis=1)\n", " print(\"Done processing data!\")\n", " display(sequence_expression_df.head())\n", " \n", " polylinker_activity_data[library] = sequence_expression_df" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "caption": "### Reproducibility of massively parallel reporter assay (MPRA) measurements.\n\nEach row represents a different library and experiment. For each column, the first replicate in the title is the x-axis and the second replicate is the y-axis.", "id": "fig1s1", "label": "Figure 1—figure supplement 1." }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x576 with 16 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# File names of the raw barcode counts\n", "raw_data_files = [os.path.join(data_dir, dirname, filename) for dirname, filename in itertools.product([\"Rhodopsin\", \"Polylinker\"], [\"library1RawBarcodeCounts.txt\", \"library2RawBarcodeCounts.txt\"])]\n", "raw_data_names = [\"Library 1\\n+Rho\", \"Library 2\\n+Rho\", \"Library 1\\n+Polylinker\", \"Library 2\\n+Polylinker\"]\n", "comparison_columns = [\"Rep 1 vs 2\", \"Rep 1 vs 3\", \"Rep 2 vs 3\"]\n", "fig, ax_list = plt.subplots(nrows=4, ncols=3, figsize=(8, 8))\n", "\n", "# Read in each dataset\n", "for row, filename in enumerate(raw_data_files):\n", "    row_df = pd.read_csv(filename, sep=\"\\t\")\n", "    # Get all 3 pairs of combinations and plot them\n", "    for col, (x, y) in enumerate(itertools.combinations([\"RNA1\", \"RNA2\", \"RNA3\"], 2)):\n", "        rsquared = stats.pearsonr(row_df[x], row_df[y])[0] ** 2\n", "        ax = ax_list[row, col]\n", "        ax.scatter(row_df[x] / 1000, row_df[y] / 1000, color=\"k\")\n", "        ax.text(0.02, 0.98, fr\"$r^2$={rsquared:.2f}\", 
transform=ax.transAxes, ha=\"left\", va=\"top\")\n", "        max_value = max(ax.get_xlim()[1], ax.get_ylim()[1])\n", "        ax.set_xlim(right=max_value)\n", "        ax.set_ylim(top=max_value)\n", " \n", "# Add \"axis\" labels\n", "fig.text(0.5, 0.025, \"Raw barcode counts (thousands)\", ha=\"center\", va=\"top\", fontsize=14)\n", "fig.text(0.025, 0.5, \"Raw barcode counts (thousands)\", rotation=90, ha=\"right\", va=\"center\", fontsize=14)\n", "\n", "# Add column labels at the top\n", "for col, text in enumerate(comparison_columns):\n", "    ax_list[0, col].set_title(text)\n", " \n", "# Add row labels on the right\n", "for row, text in enumerate(raw_data_names):\n", "    twinax = ax_list[row, 2].twinx()\n", "    twinax.set_ylabel(text)\n", "    twinax.set_yticks([])\n", " \n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "caption": "### Calibration of massively parallel reporter assay (MPRA) libraries with the _Rho_ promoter.\n\nProbability density histogram of the same 150 scrambled sequences in two libraries after normalizing to the basal _Rho_ promoter.", "id": "fig1s2", "label": "Figure 1—figure supplement 2." 
}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Scrambled sequences from L1 and L2 are drawn from the same distribution, KS test p = 0.087, D = 0.14\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "library1_rho_df = rho_activity_data[\"library1\"]\n", "library1_rho_df[\"library\"] = 1\n", "library2_rho_df = rho_activity_data[\"library2\"]\n", "library2_rho_df[\"library\"] = 2\n", "\n", "# Get scrambled sequences from each library with RNA barcodes measured\n", "scrambled_library1_df = library1_rho_df[library1_rho_df.index.str.contains(\"scrambled\") & (library1_rho_df[\"expression\"] > 0)]\n", "scrambled_library2_df = library2_rho_df[library2_rho_df.index.str.contains(\"scrambled\") & (library2_rho_df[\"expression\"] > 0)]\n", "\n", "# Compare distributions of log2 expression\n", "scrambled_library1_expr = np.log2(scrambled_library1_df[\"expression\"])\n", "scrambled_library2_expr = np.log2(scrambled_library2_df[\"expression\"])\n", "ks_stat, pval = stats.ks_2samp(scrambled_library1_expr, scrambled_library2_expr)\n", "print(f\"Scrambled sequences from L1 and L2 are drawn from the same distribution, KS test p = {pval:.3f}, D = {ks_stat:.2f}\")\n", "\n", "# Show the two histograms\n", "fig, ax = plt.subplots()\n", "ax.hist([scrambled_library2_expr, scrambled_library1_expr], bins=\"auto\", histtype=\"stepfilled\", density=True, label=[\"library 2\", \"library 1\"], color=plot_utils.set_color([0.75, 0.25]), alpha=0.5)\n", "ax.set_xlabel(\"log2 Scrambled Activity/Rho\")\n", "ax.set_ylabel(\"Density\")\n", "ax.legend(loc=\"upper left\", frameon=False)\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Joining together data from the two libraries with the Rho promoter.\n", "Annotating sequences as strong enhancer, weak 
enhancer, inactive, silencer, or ambiguous.\n", "Cutoff to call something a strong enhancer: activity is above 2.84\n", "Joining together data from the two libraries with the Polylinker promoter and annotate for autonomous activity.\n", "Computing the effect size upon mutating CRX motifs in the presence of the Rho promoter.\n", "This is for Figure 5, but it is easier to do it here.\n", "Joining Rho and Polylinker data together.\n", "Annotating sequences for binding patterns.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in greater\n", " return (self.a < x) & (x < self.b)\n", "/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:879: RuntimeWarning: invalid value encountered in less\n", " return (self.a < x) & (x < self.b)\n", "/home/ryan/miniconda/envs/bclab/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1821: RuntimeWarning: invalid value encountered in less_equal\n", " cond2 = cond0 & (x <= self.a)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Done processing and annotating data. 
This table corresponds to Supplementary file 3.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>expression_WT</th>\n", " <th>expression_std_WT</th>\n", " <th>expression_reps_WT</th>\n", " <th>expression_pvalue_WT</th>\n", " <th>expression_qvalue_WT</th>\n", " <th>library_WT</th>\n", " <th>expression_log2_WT</th>\n", " <th>group_name_WT</th>\n", " <th>plot_color_WT</th>\n", " <th>variant_WT</th>\n", " <th>...</th>\n", " <th>wt_vs_mut_pvalue</th>\n", " <th>wt_vs_mut_qvalue</th>\n", " <th>expression_POLY</th>\n", " <th>expression_SEM_POLY</th>\n", " <th>expression_log2_POLY</th>\n", " <th>autonomous_activity</th>\n", " <th>crx_bound</th>\n", " <th>nrl_bound</th>\n", " <th>mef2d_bound</th>\n", " <th>binding_group</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-104768570-104768734_UPCQ</th>\n", " <td>3.606621</td>\n", " <td>0.297412</td>\n", " <td>3.0</td>\n", " <td>0.001206</td>\n", " <td>0.003548</td>\n", " <td>1</td>\n", " <td>1.851048</td>\n", " <td>Weak enhancer</td>\n", " <td>#a6cee3</td>\n", " <td>WT</td>\n", " <td>...</td>\n", " <td>0.092328</td>\n", " <td>0.147455</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>-6.643856</td>\n", " <td>False</td>\n", " <td>False</td>\n", " 
<td>False</td>\n", " <td>False</td>\n", " <td>No binding</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106008207-106008371_CPPE</th>\n", " <td>2.068611</td>\n", " <td>0.944664</td>\n", " <td>3.0</td>\n", " <td>0.080583</td>\n", " <td>0.103242</td>\n", " <td>1</td>\n", " <td>1.049360</td>\n", " <td>NaN</td>\n", " <td>grey</td>\n", " <td>WT</td>\n", " <td>...</td>\n", " <td>0.145377</td>\n", " <td>0.212937</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>-6.643856</td>\n", " <td>False</td>\n", " <td>True</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>CRX only</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-106696554-106696718_CPPE</th>\n", " <td>8.261201</td>\n", " <td>1.317719</td>\n", " <td>3.0</td>\n", " <td>0.000008</td>\n", " <td>0.000217</td>\n", " <td>1</td>\n", " <td>3.046526</td>\n", " <td>Strong enhancer</td>\n", " <td>#1f78b4</td>\n", " <td>WT</td>\n", " <td>...</td>\n", " <td>0.003104</td>\n", " <td>0.013211</td>\n", " <td>0.795621</td>\n", " <td>0.058574</td>\n", " <td>-0.311827</td>\n", " <td>False</td>\n", " <td>True</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>CRX only</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-118321635-118321799_CPPP</th>\n", " <td>1.368148</td>\n", " <td>0.397835</td>\n", " <td>3.0</td>\n", " <td>0.166861</td>\n", " <td>0.196017</td>\n", " <td>1</td>\n", " <td>0.453279</td>\n", " <td>Inactive</td>\n", " <td>#33a02c</td>\n", " <td>WT</td>\n", " <td>...</td>\n", " <td>0.080966</td>\n", " <td>0.132766</td>\n", " <td>0.000000</td>\n", " <td>0.000000</td>\n", " <td>-6.643856</td>\n", " <td>False</td>\n", " <td>True</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>CRX only</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-118589610-118589774_UPCE</th>\n", " <td>0.184993</td>\n", " <td>0.077742</td>\n", " <td>3.0</td>\n", " <td>0.019478</td>\n", " <td>0.031968</td>\n", " <td>1</td>\n", " <td>-2.426678</td>\n", " <td>Silencer</td>\n", " <td>#e31a1c</td>\n", " <td>WT</td>\n", " <td>...</td>\n", " 
<td>0.005790</td>\n", " <td>0.019789</td>\n", " <td>0.308888</td>\n", " <td>0.138871</td>\n", " <td>-1.648877</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>False</td>\n", " <td>No binding</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 31 columns</p>\n", "</div>" ], "text/plain": [ " expression_WT expression_std_WT \\\n", "label \n", "chr1-104768570-104768734_UPCQ 3.606621 0.297412 \n", "chr1-106008207-106008371_CPPE 2.068611 0.944664 \n", "chr1-106696554-106696718_CPPE 8.261201 1.317719 \n", "chr1-118321635-118321799_CPPP 1.368148 0.397835 \n", "chr1-118589610-118589774_UPCE 0.184993 0.077742 \n", "\n", " expression_reps_WT expression_pvalue_WT \\\n", "label \n", "chr1-104768570-104768734_UPCQ 3.0 0.001206 \n", "chr1-106008207-106008371_CPPE 3.0 0.080583 \n", "chr1-106696554-106696718_CPPE 3.0 0.000008 \n", "chr1-118321635-118321799_CPPP 3.0 0.166861 \n", "chr1-118589610-118589774_UPCE 3.0 0.019478 \n", "\n", " expression_qvalue_WT library_WT \\\n", "label \n", "chr1-104768570-104768734_UPCQ 0.003548 1 \n", "chr1-106008207-106008371_CPPE 0.103242 1 \n", "chr1-106696554-106696718_CPPE 0.000217 1 \n", "chr1-118321635-118321799_CPPP 0.196017 1 \n", "chr1-118589610-118589774_UPCE 0.031968 1 \n", "\n", " expression_log2_WT group_name_WT \\\n", "label \n", "chr1-104768570-104768734_UPCQ 1.851048 Weak enhancer \n", "chr1-106008207-106008371_CPPE 1.049360 NaN \n", "chr1-106696554-106696718_CPPE 3.046526 Strong enhancer \n", "chr1-118321635-118321799_CPPP 0.453279 Inactive \n", "chr1-118589610-118589774_UPCE -2.426678 Silencer \n", "\n", " plot_color_WT variant_WT ... \\\n", "label ... \n", "chr1-104768570-104768734_UPCQ #a6cee3 WT ... \n", "chr1-106008207-106008371_CPPE grey WT ... \n", "chr1-106696554-106696718_CPPE #1f78b4 WT ... \n", "chr1-118321635-118321799_CPPP #33a02c WT ... \n", "chr1-118589610-118589774_UPCE #e31a1c WT ... 
\n", "\n", " wt_vs_mut_pvalue wt_vs_mut_qvalue \\\n", "label \n", "chr1-104768570-104768734_UPCQ 0.092328 0.147455 \n", "chr1-106008207-106008371_CPPE 0.145377 0.212937 \n", "chr1-106696554-106696718_CPPE 0.003104 0.013211 \n", "chr1-118321635-118321799_CPPP 0.080966 0.132766 \n", "chr1-118589610-118589774_UPCE 0.005790 0.019789 \n", "\n", " expression_POLY expression_SEM_POLY \\\n", "label \n", "chr1-104768570-104768734_UPCQ 0.000000 0.000000 \n", "chr1-106008207-106008371_CPPE 0.000000 0.000000 \n", "chr1-106696554-106696718_CPPE 0.795621 0.058574 \n", "chr1-118321635-118321799_CPPP 0.000000 0.000000 \n", "chr1-118589610-118589774_UPCE 0.308888 0.138871 \n", "\n", " expression_log2_POLY autonomous_activity \\\n", "label \n", "chr1-104768570-104768734_UPCQ -6.643856 False \n", "chr1-106008207-106008371_CPPE -6.643856 False \n", "chr1-106696554-106696718_CPPE -0.311827 False \n", "chr1-118321635-118321799_CPPP -6.643856 False \n", "chr1-118589610-118589774_UPCE -1.648877 False \n", "\n", " crx_bound nrl_bound mef2d_bound binding_group \n", "label \n", "chr1-104768570-104768734_UPCQ False False False No binding \n", "chr1-106008207-106008371_CPPE True False False CRX only \n", "chr1-106696554-106696718_CPPE True False False CRX only \n", "chr1-118321635-118321799_CPPP True False False CRX only \n", "chr1-118589610-118589774_UPCE False False False No binding \n", "\n", "[5 rows x 31 columns]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Join and annotate all data\n", "print(\"Joining together data from the two libraries with the Rho promoter.\")\n", "color_mapping = {\n", "    \"Strong enhancer\": \"#1f78b4\",\n", "    \"Weak enhancer\": \"#a6cee3\",\n", "    \"Inactive\": \"#33a02c\",\n", "    \"Silencer\": \"#e31a1c\",\n", "    np.nan: \"grey\"\n", "}\n", "\n", "# Join the libraries and add a pseudocount to take log2\n", "rho_df = pd.concat([library1_rho_df, library2_rho_df])\n", "rho_pseudocount = 1e-3\n", "rho_df[\"expression_log2\"] = 
np.log2(rho_df[\"expression\"] + rho_pseudocount)\n", "\n", "# Define cutoff for a strong enhancer based on scrambled sequences\n", "print(\"Annotating sequences as strong enhancer, weak enhancer, inactive, silencer, or ambiguous.\")\n", "scrambled_mask = rho_df.index.str.contains(\"scrambled\")\n", "scrambled_df = rho_df[scrambled_mask]\n", "scrambled_df = scrambled_df[scrambled_df[\"expression\"].notna()]\n", "strong_cutoff = scrambled_df[\"expression_log2\"].quantile(0.95)\n", "print(f\"Cutoff to call something a strong enhancer: activity is above {strong_cutoff:.2f}\")\n", "\n", "# Drop scrambled sequences\n", "rho_df = rho_df[~scrambled_mask]\n", "\n", "# Helper function to label and color a sequence\n", "def label_color_sequence(row, alpha=0.05, strong_cutoff=strong_cutoff, inactive_cutoff=1, color_mapping=color_mapping):\n", "    expr_log2 = row[\"expression_log2\"]\n", "    qval = row[\"expression_qvalue\"]\n", "    # Inactive\n", "    if (np.abs(expr_log2) <= inactive_cutoff) & (qval >= alpha):\n", "        group = \"Inactive\"\n", "    # Silencer\n", "    elif (expr_log2 < -inactive_cutoff) & ((qval < alpha) | (row[\"expression\"] == 0)):\n", "        group = \"Silencer\"\n", "    # Enhancer\n", "    elif (expr_log2 > inactive_cutoff) & (qval < alpha):\n", "        # Strong\n", "        if expr_log2 > strong_cutoff:\n", "            group = \"Strong enhancer\"\n", "        # Weak\n", "        else:\n", "            group = \"Weak enhancer\"\n", "    # Ambiguous\n", "    else:\n", "        group = np.nan\n", " \n", "    color = color_mapping[group]\n", "    return pd.Series({\"group_name\": group, \"plot_color\": color})\n", "\n", "# Annotate both WT and MUT sequences\n", "rho_df = rho_df.join(rho_df.apply(label_color_sequence, axis=1))\n", "rho_df[\"group_name\"] = sequence_annotation_processing.to_categorical(rho_df[\"group_name\"])\n", "\n", "# Now do Polylinker data\n", "library1_poly_df = polylinker_activity_data[\"library1\"]\n", "library2_poly_df = polylinker_activity_data[\"library2\"]\n", "print(\"Joining together data from the two libraries with the 
Polylinker promoter and annotate for autonomous activity.\")\n", "poly_df = pd.concat([library1_poly_df, library2_poly_df])\n", "poly_pseudocount = 1e-2\n", "poly_df[\"expression_log2\"] = np.log2(poly_df[\"expression\"] + poly_pseudocount)\n", "poly_df[\"autonomous_activity\"] = (poly_df[\"expression_log2\"] > 0)\n", "\n", "# Compute effect of mutating CRX motifs in the presence of the Rho promoter.\n", "print(\"Computing the effect size upon mutating CRX motifs in the presence of the Rho promoter.\")\n", "print(\"This is for Figure 5, but it is easier to do it here.\")\n", "wt_mask = rho_df.index.str.contains(\"_WT$\")\n", "mut_mask = rho_df.index.str.contains(\"_MUT-allCrxSites$\")\n", "\n", "# Add variant info as a column, then trim it off the index\n", "rho_df_no_variant_df = rho_df.copy()\n", "rho_df_no_variant_df[\"variant\"] = rho_df_no_variant_df.index.str.split(\"_\").str[2:].str.join(\"_\")\n", "rho_df_no_variant_df = sequence_annotation_processing.remove_mutations_from_seq_id(rho_df_no_variant_df)\n", "\n", "# Separate out WT and MUT, then join them together on the same row\n", "wt_df = rho_df_no_variant_df[wt_mask]\n", "mut_df = rho_df_no_variant_df[mut_mask]\n", "wt_vs_mut_rho_df = wt_df.join(mut_df, lsuffix=\"_WT\", rsuffix=\"_MUT\")\n", "wt_vs_mut_rho_df[\"wt_vs_mut_log2\"] = wt_vs_mut_rho_df[\"expression_log2_WT\"] - wt_vs_mut_rho_df[\"expression_log2_MUT\"]\n", "\n", "# Compute parameters for lognormal distribution to do stats\n", "# (here \"cov\" is the coefficient of variation, std / mean)\n", "wt_cov = wt_vs_mut_rho_df[\"expression_std_WT\"] / wt_vs_mut_rho_df[\"expression_WT\"]\n", "wt_log_mean = np.log(wt_vs_mut_rho_df[\"expression_WT\"] / np.sqrt(wt_cov**2 + 1))\n", "wt_log_std = np.sqrt(np.log(wt_cov**2 + 1))\n", "mut_cov = wt_vs_mut_rho_df[\"expression_std_MUT\"] / wt_vs_mut_rho_df[\"expression_MUT\"]\n", "mut_log_mean = np.log(wt_vs_mut_rho_df[\"expression_MUT\"] / np.sqrt(mut_cov**2 + 1))\n", "mut_log_std = np.sqrt(np.log(mut_cov**2 + 1))\n", "\n", "# Do t-tests and FDR\n", 
"wt_vs_mut_rho_df[\"wt_vs_mut_pvalue\"] = stats.ttest_ind_from_stats(wt_log_mean, wt_log_std, wt_vs_mut_rho_df[\"expression_reps_WT\"], mut_log_mean, mut_log_std, wt_vs_mut_rho_df[\"expression_reps_MUT\"], equal_var=False)[1]\n", "wt_vs_mut_rho_df[\"wt_vs_mut_qvalue\"] = modeling.fdr(wt_vs_mut_rho_df[\"wt_vs_mut_pvalue\"])\n", "\n", "# Pull out WT polylinker measurements\n", "print(\"Joining Rho and Polylinker data together.\")\n", "poly_wt_df = poly_df.copy()\n", "poly_wt_df = poly_wt_df[poly_wt_df.index.str.contains(\"WT\")]\n", "\n", "# Drop the variant ID\n", "poly_wt_df = poly_wt_df.rename(index=lambda x: x[:-3], columns={\"expression\": \"expression_POLY\", \"expression_SEM\": \"expression_SEM_POLY\", \"expression_log2\": \"expression_log2_POLY\"})\n", "\n", "# Join with Rho\n", "activity_df = wt_vs_mut_rho_df.join(poly_wt_df)\n", "\n", "print(\"Annotating sequences for binding patterns.\")\n", "# Get info on CRX binding from the seq ID strings\n", "activity_df[\"crx_bound\"] = activity_df.index.str.contains(\"_C...$\")\n", "\n", "# Read in BED files\n", "library_bed = BedTool(os.path.join(data_dir, \"library1And2.bed\"))\n", "nrl_chip_bed = BedTool(os.path.join(\"Data\", \"Downloaded\", \"ChIP\", \"nrlPeaksMm10.bed\"))\n", "mef2d_chip_bed = BedTool(os.path.join(\"Data\", \"Downloaded\", \"ChIP\", \"mef2dPeaksMm10.bed\"))\n", "\n", "# Get binding patterns for NRL and MEF2D\n", "library_nrl_bound_df = library_bed.intersect(nrl_chip_bed, wa=True).to_dataframe()\n", "activity_df[\"nrl_bound\"] = activity_df.index.isin(library_nrl_bound_df[\"name\"])\n", "\n", "library_mef2d_bound_df = library_bed.intersect(mef2d_chip_bed, wa=True).to_dataframe()\n", "activity_df[\"mef2d_bound\"] = activity_df.index.isin(library_mef2d_bound_df[\"name\"])\n", "\n", "# Helper function to \"reverse one hot encode\" binding patterns\n", "def annotate_binding(row):\n", "    crx, nrl, mef2d = row[[\"crx_bound\", \"nrl_bound\", \"mef2d_bound\"]]\n", "    if crx:\n", "        if nrl:\n", "            if 
mef2d:\n", "                return \"All three\"\n", "            else:\n", "                return \"CRX+NRL\"\n", "        elif mef2d:\n", "            return \"CRX+MEF2D\"\n", "        else:\n", "            return \"CRX only\"\n", "    elif nrl:\n", "        if mef2d:\n", "            return \"NRL+MEF2D\"\n", "        else:\n", "            return \"NRL only\"\n", "    elif mef2d:\n", "        return \"MEF2D only\"\n", "    else:\n", "        return \"No binding\"\n", "\n", "activity_df[\"binding_group\"] = activity_df.apply(annotate_binding, axis=1)\n", "print(\"Done processing and annotating data. This table corresponds to Supplementary file 3.\")\n", "display(activity_df.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strong enhancers and silencers have high CRX motif content\n", "\n", "The _cis_-regulatory activities of CRX-targeted sequences vary widely ([Figure 1a](#fig1)). We defined enhancers and silencers as those sequences that have statistically significant activity that is at least twofold above or below the activity of the basal _Rho_ promoter (Welch’s t-test, Benjamini-Hochberg false discovery rate (FDR) q < 0.05, [Supplementary file 3](#supp3)). We defined inactive sequences as those whose activity is both within a twofold change of basal activity and not significantly different from the basal _Rho_ promoter. We further stratified enhancers into strong and weak enhancers based on whether their activity fell above the 95th percentile of scrambled sequence activity. Using these criteria, 22% of CRX-targeted sequences are strong enhancers, 28% are weak enhancers, 19% are inactive, and 17% are silencers; the remaining 13% were considered ambiguous and removed from further analysis. To test whether these sequences function as CRX-dependent enhancers and silencers in the genome, we examined genes differentially expressed in _Crx^-/-^_ retina [@bib71]. 
Genes that are de-repressed are more likely to be near silencers (Fisher’s exact test p = 0.001, odds ratio = 2.1, n = 206) and genes that are down-regulated are more likely to be near enhancers (Fisher’s exact test p = 0.02, odds ratio = 1.5, n = 344, Materials and methods), suggesting that our reporter assay identified sequences that act as enhancers and silencers in the genome. We sought to identify features that would accurately classify these different classes of sequences." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Computing predicted occupancy of 8 TFs on every WT and mutant sequence. This might take 2-3 minutes.\n", "Done computing predicted occupancies. This corresponds to Supplementary table 4.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>CRX</th>\n", " <th>GFI1</th>\n", " <th>MAZ</th>\n", " <th>MEF2D</th>\n", " <th>NDF1</th>\n", " <th>NRL</th>\n", " <th>RORB</th>\n", " <th>RAX</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-4357766-4357930_CPPP_WT</th>\n", " <td>2.297972</td>\n", " <td>1.871720e-01</td>\n", " <td>2.204502e-08</td>\n", " <td>1.421229e-06</td>\n", " <td>3.064604e-07</td>\n", " <td>1.001505</td>\n", " <td>2.370847e-02</td>\n", " <td>0.005755</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-4357766-4357930_CPPP_MUT-allCrxSites</th>\n", " <td>0.239708</td>\n", " <td>3.783122e-11</td>\n", " 
<td>2.204502e-08</td>\n", " <td>1.421229e-06</td>\n", " <td>3.064606e-07</td>\n", " <td>1.411916</td>\n", " <td>2.340304e-02</td>\n", " <td>0.004416</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-73826292-73826456_CPPE_WT</th>\n", " <td>2.290427</td>\n", " <td>6.397380e-03</td>\n", " <td>5.577725e-03</td>\n", " <td>1.815852e-09</td>\n", " <td>6.713635e-07</td>\n", " <td>0.993418</td>\n", " <td>2.922269e-04</td>\n", " <td>0.000004</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-73826292-73826456_CPPE_MUT-allCrxSites</th>\n", " <td>0.293410</td>\n", " <td>1.203730e-08</td>\n", " <td>5.577725e-03</td>\n", " <td>6.339047e-11</td>\n", " <td>6.713632e-07</td>\n", " <td>0.993414</td>\n", " <td>1.239630e-07</td>\n", " <td>0.000002</td>\n", " </tr>\n", " <tr>\n", " <th>chr11-87108697-87108861_CPPP_WT</th>\n", " <td>2.718470</td>\n", " <td>6.025624e-01</td>\n", " <td>2.744230e-12</td>\n", " <td>2.986062e-06</td>\n", " <td>6.477337e-07</td>\n", " <td>0.040965</td>\n", " <td>4.672926e-05</td>\n", " <td>0.190641</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " CRX GFI1 \\\n", "label \n", "chr1-4357766-4357930_CPPP_WT 2.297972 1.871720e-01 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 0.239708 3.783122e-11 \n", "chr1-73826292-73826456_CPPE_WT 2.290427 6.397380e-03 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 0.293410 1.203730e-08 \n", "chr11-87108697-87108861_CPPP_WT 2.718470 6.025624e-01 \n", "\n", " MAZ MEF2D \\\n", "label \n", "chr1-4357766-4357930_CPPP_WT 2.204502e-08 1.421229e-06 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 2.204502e-08 1.421229e-06 \n", "chr1-73826292-73826456_CPPE_WT 5.577725e-03 1.815852e-09 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 5.577725e-03 6.339047e-11 \n", "chr11-87108697-87108861_CPPP_WT 2.744230e-12 2.986062e-06 \n", "\n", " NDF1 NRL \\\n", "label \n", "chr1-4357766-4357930_CPPP_WT 3.064604e-07 1.001505 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 3.064606e-07 1.411916 \n", 
"chr1-73826292-73826456_CPPE_WT 6.713635e-07 0.993418 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 6.713632e-07 0.993414 \n", "chr11-87108697-87108861_CPPP_WT 6.477337e-07 0.040965 \n", "\n", " RORB RAX \n", "label \n", "chr1-4357766-4357930_CPPP_WT 2.370847e-02 0.005755 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 2.340304e-02 0.004416 \n", "chr1-73826292-73826456_CPPE_WT 2.922269e-04 0.000004 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 1.239630e-07 0.000002 \n", "chr11-87108697-87108861_CPPP_WT 4.672926e-05 0.190641 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Calculate predicted occupancy of all TFs\n", "print(\"Computing predicted occupancy of 8 TFs on every WT and mutant sequence. This might take 2-3 minutes.\")\n", "\n", "# Load in PWMs\n", "pwms = predicted_occupancy.read_pwm_files(os.path.join(\"Data\", \"Downloaded\", \"Pwm\", \"photoreceptorAndEnrichedMotifs.meme\"))\n", "pwms = pwms.rename(lambda x: x.split(\"_\")[0])\n", "# Reverse complement RAX for display purposes\n", "rax = pwms[\"RAX\"].copy()\n", "rax = rax[::-1].reset_index(drop=True)\n", "rax_rc = rax.copy()\n", "rax_rc[\"A\"] = rax[\"T\"]\n", "rax_rc[\"C\"] = rax[\"G\"]\n", "rax_rc[\"G\"] = rax[\"C\"]\n", "rax_rc[\"T\"] = rax[\"A\"]\n", "pwms[\"RAX\"] = rax_rc\n", "motif_len = pwms.apply(len)\n", "ewms = pwms.apply(predicted_occupancy.ewm_from_letter_prob).apply(predicted_occupancy.ewm_to_dict)\n", "mu = 9\n", "\n", "# Do predicted occupancy scans\n", "occupancy_df = predicted_occupancy.all_seq_total_occupancy(all_seqs, ewms, mu, convert_ewm=False)\n", "print(\"Done computing predicted occupancies. 
This corresponds to Supplementary table 4.\")\n", "display(occupancy_df.head())\n", "\n", "# Separate out the WT sequences\n", "wt_occupancy_df = occupancy_df[occupancy_df.index.str.contains(\"WT$\")].copy()\n", "wt_occupancy_df = sequence_annotation_processing.remove_mutations_from_seq_id(wt_occupancy_df)\n", "wt_occupancy_df = wt_occupancy_df.loc[activity_df.index]\n", "n_tfs = len(wt_occupancy_df.columns)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Computing information content of sequences.\n", "Done computing information content and related metrics. This corresponds to Supplementary table 5.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>total_occupancy</th>\n", " <th>diversity</th>\n", " <th>entropy</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1-4357766-4357930_CPPP_WT</th>\n", " <td>3.516114</td>\n", " <td>2.0</td>\n", " <td>2.291861</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-4357766-4357930_CPPP_MUT-allCrxSites</th>\n", " <td>1.679445</td>\n", " <td>1.0</td>\n", " <td>0.440493</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-73826292-73826456_CPPE_WT</th>\n", " <td>3.296117</td>\n", " <td>2.0</td>\n", " <td>1.743370</td>\n", " </tr>\n", " <tr>\n", " <th>chr1-73826292-73826456_CPPE_MUT-allCrxSites</th>\n", " <td>1.292404</td>\n", " <td>1.0</td>\n", " <td>0.378922</td>\n", " </tr>\n", " <tr>\n", " <th>chr11-87108697-87108861_CPPP_WT</th>\n", " <td>3.552689</td>\n", " <td>2.0</td>\n", " 
<td>1.867968</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " total_occupancy diversity \\\n", "label \n", "chr1-4357766-4357930_CPPP_WT 3.516114 2.0 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 1.679445 1.0 \n", "chr1-73826292-73826456_CPPE_WT 3.296117 2.0 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 1.292404 1.0 \n", "chr11-87108697-87108861_CPPP_WT 3.552689 2.0 \n", "\n", " entropy \n", "label \n", "chr1-4357766-4357930_CPPP_WT 2.291861 \n", "chr1-4357766-4357930_CPPP_MUT-allCrxSites 0.440493 \n", "chr1-73826292-73826456_CPPE_WT 1.743370 \n", "chr1-73826292-73826456_CPPE_MUT-allCrxSites 0.378922 \n", "chr11-87108697-87108861_CPPP_WT 1.867968 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "print(\"Computing information content of sequences.\")\n", "entropy_df = occupancy_df.apply(predicted_occupancy.boltzmann_entropy, axis=1)\n", "print(\"Done computing information content and related metrics. This corresponds to Supplementary table 5.\")\n", "display(entropy_df.head())\n", "\n", "wt_entropy_df = entropy_df[entropy_df.index.str.contains(\"WT$\")].copy()\n", "wt_entropy_df = sequence_annotation_processing.remove_mutations_from_seq_id(wt_entropy_df)\n", "wt_entropy_df = wt_entropy_df.loc[activity_df.index]\n", "\n", "mut_entropy_df = entropy_df[entropy_df.index.str.contains(\"MUT\")].copy()\n", "mut_entropy_df = sequence_annotation_processing.remove_mutations_from_seq_id(mut_entropy_df)\n", "mut_entropy_df = mut_entropy_df.loc[activity_df.index]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "caption": "### Activity of putative _cis_-regulatory sequences with cone-rod homeobox (CRX) motifs.\n\n(**a**) Volcano plot of activity scores relative to the _Rho_ promoter alone. Sequences are grouped as strong enhancers (dark blue), weak enhancers (light blue), inactive (green), silencers (red), or ambiguous (gray). Horizontal line, false discovery rate (FDR) q = 0.05. 
Vertical lines, twofold above and below _Rho_. (**b**) Fraction of ChIP-seq and ATAC-seq peaks that belong to each activity group. (**c**) Predicted CRX occupancy of each activity group. Horizontal lines, medians; enh., enhancer. Numbers at top of (**b and c**) indicate n for groups.", "id": "fig1", "label": "Figure 1." }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Frequency of each activity bin in WT sequences:\n" ] }, { "data": { "text/plain": [ "Silencer 0.173615\n", "Inactive 0.192491\n", "Weak enhancer 0.282099\n", "Strong enhancer 0.218005\n", "NaN 0.133790\n", "Name: group_name_WT, dtype: float64" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Frequency of activity bins vs. CRX binding status:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>group_name_WT</th>\n", " <th>Silencer</th>\n", " <th>Inactive</th>\n", " <th>Weak enhancer</th>\n", " <th>Strong enhancer</th>\n", " </tr>\n", " <tr>\n", " <th>crx_bound</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>ATAC-seq</th>\n", " <td>281</td>\n", " <td>363</td>\n", " <td>430</td>\n", " <td>211</td>\n", " </tr>\n", " <tr>\n", " <th>ChIP-seq</th>\n", " <td>556</td>\n", " <td>565</td>\n", " <td>930</td>\n", " <td>840</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "group_name_WT Silencer Inactive Weak enhancer Strong enhancer\n", "crx_bound \n", "ATAC-seq 281 363 430 211\n", "ChIP-seq 556 565 930 840" ] }, "metadata": {}, "output_type": "display_data" }, { "name": 
"stdout", "output_type": "stream", "text": [ "ChIP-seq status is independent of whether a sequence is inactive, Fisher's exact test p=2e-07, odds ratio=1.49\n", "ChIP-seq status is independent of whether a sequence is a strong enhancer, Fisher's exact test p=1e-21, odds ratio=2.16\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>group_name_WT</th>\n", " <th>Silencer</th>\n", " <th>Inactive</th>\n", " <th>Weak enhancer</th>\n", " <th>Strong enhancer</th>\n", " </tr>\n", " <tr>\n", " <th>crx_bound</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>ATAC-seq</th>\n", " <td>0.218677</td>\n", " <td>0.282490</td>\n", " <td>0.334630</td>\n", " <td>0.164202</td>\n", " </tr>\n", " <tr>\n", " <th>ChIP-seq</th>\n", " <td>0.192321</td>\n", " <td>0.195434</td>\n", " <td>0.321688</td>\n", " <td>0.290557</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "group_name_WT Silencer Inactive Weak enhancer Strong enhancer\n", "crx_bound \n", "ATAC-seq 0.218677 0.282490 0.334630 0.164202\n", "ChIP-seq 0.192321 0.195434 0.321688 0.290557" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Predicted CRX occupancies:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr 
style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_WT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>Silencer</th>\n", " <td>837.0</td>\n", " <td>2.822068</td>\n", " <td>1.474613</td>\n", " <td>0.013521</td>\n", " <td>1.598510</td>\n", " <td>2.724195</td>\n", " <td>3.916786</td>\n", " <td>8.028408</td>\n", " </tr>\n", " <tr>\n", " <th>Inactive</th>\n", " <td>928.0</td>\n", " <td>2.232489</td>\n", " <td>1.342345</td>\n", " <td>0.001052</td>\n", " <td>1.173444</td>\n", " <td>2.048457</td>\n", " <td>3.136282</td>\n", " <td>6.759976</td>\n", " </tr>\n", " <tr>\n", " <th>Weak enhancer</th>\n", " <td>1360.0</td>\n", " <td>2.216861</td>\n", " <td>1.220496</td>\n", " <td>0.000385</td>\n", " <td>1.235126</td>\n", " <td>2.113810</td>\n", " <td>2.988673</td>\n", " <td>7.801177</td>\n", " </tr>\n", " <tr>\n", " <th>Strong enhancer</th>\n", " <td>1051.0</td>\n", " <td>2.534010</td>\n", " <td>1.169460</td>\n", " <td>0.003694</td>\n", " <td>1.616414</td>\n", " <td>2.490314</td>\n", " <td>3.285321</td>\n", " <td>7.368500</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_WT \n", "Silencer 837.0 2.822068 1.474613 0.013521 1.598510 2.724195 \n", "Inactive 928.0 2.232489 1.342345 0.001052 1.173444 2.048457 \n", "Weak enhancer 1360.0 2.216861 1.220496 0.000385 1.235126 2.113810 \n", "Strong enhancer 1051.0 2.534010 1.169460 0.003694 1.616414 2.490314 \n", "\n", " 75% max \n", "group_name_WT \n", "Silencer 3.916786 8.028408 \n", "Inactive 3.136282 6.759976 \n", "Weak enhancer 2.988673 7.801177 \n", "Strong enhancer 3.285321 7.368500 " ] }, "metadata": {}, "output_type": 
"display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Strong enhancers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = 6e-10 U = 566045.00\n", "Silencers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = 6e-17, U = 477843.00\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 432x576 with 5 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Mapping activity class to a color\n", "color_mapping = {\n", " \"Silencer\": \"#e31a1c\",\n", " \"Inactive\": \"#33a02c\",\n", " \"Weak enhancer\": \"#a6cee3\",\n", " \"Strong enhancer\": \"#1f78b4\",\n", " np.nan: \"grey\"\n", "}\n", "color_mapping = pd.Series(color_mapping)\n", "\n", "# Sort order for the four activity bins\n", "class_sort_order = [\"Silencer\", \"Inactive\", \"Weak enhancer\", \"Strong enhancer\"]\n", "activity_df[\"group_name_WT\"] = sequence_annotation_processing.to_categorical(activity_df[\"group_name_WT\"])\n", "activity_df[\"group_name_MUT\"] = sequence_annotation_processing.to_categorical(activity_df[\"group_name_MUT\"])\n", "rho_ticks = np.arange(-10, 7, 2)\n", "\n", "# We can only plot points that were detected in DNA\n", "activity_measured_wt_df = activity_df[activity_df[\"expression_log2_WT\"].notna()]\n", "print(\"Frequency of each activity bin in WT sequences:\")\n", "display(activity_measured_wt_df[\"group_name_WT\"].value_counts(normalize=True, dropna=False, sort=False))\n", "\n", "# Count frequency of activity bins for CRX bound/unbound\n", "crx_bound_grouper = activity_df.groupby(\"crx_bound\")\n", "chip_activity_bin_freqs = crx_bound_grouper[\"group_name_WT\"].value_counts().unstack()\n", "chip_activity_bin_freqs = chip_activity_bin_freqs[class_sort_order].rename(index=lambda x: \"ChIP-seq\" if x else \"ATAC-seq\")\n", "\n", "# Different ways to format group names\n", "chip_group_names_with_n = [f\"{i}\\nn={j.sum()}\" for i, j in chip_activity_bin_freqs.iterrows()]\n", 
"chip_group_names_with_n_oneline = [\" \".join(i.split()) for i in chip_group_names_with_n]\n", "chip_group_names = chip_activity_bin_freqs.index.values\n", "chip_group_count = [j.sum() for i, j in chip_activity_bin_freqs.iterrows()]\n", "\n", "# Display the data behind Fig 1b\n", "print(\"Frequency of activity bins vs. CRX binding status:\")\n", "display(chip_activity_bin_freqs)\n", "\n", "# Test if CRX binding and inactive status is independent\n", "chip_group_inactive_counts = crx_bound_grouper[\"group_name_WT\"].apply(lambda x: (x == \"Inactive\").value_counts()).unstack()\n", "oddsratio, pval = stats.fisher_exact(chip_group_inactive_counts)\n", "# Take inverse of odds ratio to match language of manuscript and be more intuitive to the reader\n", "print(f\"ChIP-seq status is independent of whether a sequence is inactive, Fisher's exact test p={pval:.0e}, odds ratio={1/oddsratio:.2f}\")\n", "\n", "# Same for strong enhancer\n", "chip_group_inactive_counts = crx_bound_grouper[\"group_name_WT\"].apply(lambda x: (x == \"Strong enhancer\").value_counts()).unstack()\n", "oddsratio, pval = stats.fisher_exact(chip_group_inactive_counts)\n", "# Here the odds ratio is reported as computed (not inverted), which matches the language of the manuscript\n", "print(f\"ChIP-seq status is independent of whether a sequence is a strong enhancer, Fisher's exact test p={pval:.0e}, odds ratio={oddsratio:.2f}\")\n", "\n", "# Row-normalize the counts\n", "chip_activity_bin_freqs = chip_activity_bin_freqs.div(chip_activity_bin_freqs.sum(axis=1), axis=0)\n", "display(chip_activity_bin_freqs)\n", "\n", "# Setup for some downstream stuff\n", "wt_activity_grouper = activity_df.groupby(\"group_name_WT\")\n", "wt_activity_names_oneline = [\"Silencer\", \"Inactive\", \"Weak enh.\", \"Strong enh.\"]\n", "wt_activity_count = [len(j) for i, j in wt_activity_grouper]\n", "\n", "# Predicted CRX occupancy vs. 
WT group\n", "wt_occupancy_grouper = wt_occupancy_df.groupby(activity_df[\"group_name_WT\"])\n", "wt_occupancy_grouper_crx = wt_occupancy_grouper[\"CRX\"]\n", "print(\"Predicted CRX occupancies:\")\n", "display(wt_occupancy_grouper_crx.describe())\n", "\n", "# Statistics for differences in CRX occupancy between groups\n", "ustat, pval = stats.mannwhitneyu(wt_occupancy_grouper_crx.get_group(\"Strong enhancer\"), wt_occupancy_grouper_crx.get_group(\"Inactive\"), alternative=\"two-sided\")\n", "print(f\"Strong enhancers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = {pval:.0e} U = {ustat:.2f}\")\n", "ustat, pval = stats.mannwhitneyu(wt_occupancy_grouper_crx.get_group(\"Silencer\"), wt_occupancy_grouper_crx.get_group(\"Inactive\"), alternative=\"two-sided\")\n", "print(f\"Silencers and inactive sequences have the same CRX occupancy, Mann-Whitney U test p = {pval:.0e}, U = {ustat:.2f}\")\n", "\n", "# Generate the figure\n", "gs_kw = dict(width_ratios=[1, 3])\n", "fig, ax_list = plt.subplots(nrows=2, ncols=2, figsize=(6, 8), gridspec_kw=gs_kw)\n", "gs = ax_list[0, 0].get_gridspec()\n", "for ax in ax_list[0, :]:\n", " ax.remove()\n", " \n", "axbig = fig.add_subplot(gs[0, :])\n", "ax = axbig\n", "\n", "# 1a: Volcano plot\n", "fig = plot_utils.volcano_plot(activity_measured_wt_df, \"expression_log2_WT\", \"expression_qvalue_WT\",\n", " activity_measured_wt_df[\"plot_color_WT\"], xaxis_label=\"log2 Enhancer Activity/Rho\",\n", " yaxis_label=\"-log10 FDR\", xline=-np.log10(0.05), yline=[-1, 1],\n", " xticks=rho_ticks[1:], figax=(fig, ax))\n", "ax.set_yticks(np.arange(5))\n", "plot_utils.add_letter(ax, -0.125, 1, \"a\")\n", "\n", "# 1b: CRX binding status vs. 
activity classes\n", "ax = ax_list[1, 0]\n", "fig = plot_utils.stacked_bar_plots(chip_activity_bin_freqs, \"Fraction of group\", chip_group_names, color_mapping, figax=(fig, ax), vert=True)\n", "ax.set_yticks(np.linspace(0, 1, 6))\n", "plot_utils.rotate_ticks(ax.get_xticklabels()) \n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(chip_group_count, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.7, 1.03, \"b\")\n", "\n", "# 1c: Predicted CRX occupancy of different groups\n", "ax = ax_list[1, 1]\n", "fig = plot_utils.violin_plot_groupby(wt_occupancy_grouper_crx, \"Predicted CRX occupancy\", class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))\n", "ax.set_yticks(np.linspace(0, 8, 5))\n", "plot_utils.rotate_ticks(ax.get_xticklabels())\n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"c\")\n", "fig.tight_layout()\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Neither CRX ChIP-seq-binding status nor DNA accessibility as measured by ATAC-seq strongly differentiates between these four classes ([Figure 1b](#fig1)). Compared to CRX ChIP-seq peaks, ATAC-seq peaks that lack CRX binding in the adult retina are slightly enriched for inactive sequences (Fisher’s exact test p = 2 × 10^–7^, odds ratio = 1.5) and slightly depleted for strong enhancers (Fisher’s exact test p = 1 × 10^–21^, odds ratio = 2.2). 
However, sequences with ChIP-seq or ATAC-seq peaks span all four activity categories, consistent with prior reports that DNA accessibility and TF binding data are not sufficient to identify functional enhancers and silencers [@bib11; @bib29; @bib30; @bib62; @bib85].\n", "\n", "We examined whether the number and affinity of CRX motifs differentiate enhancers, silencers, and inactive sequences by computing the predicted CRX occupancy (i.e. expected number of bound molecules) for each sequence [@bib85]. Consistent with our previous work [@bib86], both strong enhancers and silencers have higher predicted CRX occupancy than inactive sequences (Mann-Whitney U test, p = 6 × 10^–10^ and 6 × 10^–17^, respectively, [Figure 1c](#fig1)), suggesting that total CRX motif content helps distinguish silencers and strong enhancers from inactive sequences. However, predicted CRX occupancy does not distinguish strong enhancers from silencers: a logistic regression classifier trained with fivefold cross-validation only achieves an area under the receiver operating characteristic (AUROC) curve of 0.548 ± 0.023 and an area under the precision recall (AUPR) curve of 0.571 ± 0.020 ([Figure 2a](#fig2) and [Figure 2—figure supplement 1](#fig2ab)). We thus sought to identify sequence features that distinguish strong enhancers from silencers." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fitting k-mer Support Vector Machine. This will take a few minutes.\n", "Fitting strong enhancer vs. silencer logistic regression model for CRX occupancy.\n", "Fitting strong enhancer vs. 
silencer logistic regression model for 8 TFs.\n", "Optimal regularization strength (C): 1.0e-02\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Prepare data for fitting models\n", "# Mask to pull out the silencers and strong enhancers\n", "silencer_modeling_mask = activity_df[\"group_name_WT\"].str.contains(\"Strong|Silencer\")\n", "silencer_modeling_mask = silencer_modeling_mask & silencer_modeling_mask.notna()\n", "# Mask to pull out the inactive seqs and the strong enhancers\n", "inactive_modeling_mask = activity_df[\"group_name_WT\"].str.contains(\"Strong|Inactive\")\n", "inactive_modeling_mask = inactive_modeling_mask & inactive_modeling_mask.notna()\n", "\n", "# Within the data to model, mask indicating which sequences are strong enhancers\n", "labels_with_silencer = activity_df.loc[silencer_modeling_mask, \"group_name_WT\"].str.contains(\"Strong\")\n", "labels_with_inactive = activity_df.loc[inactive_modeling_mask, \"group_name_WT\"].str.contains(\"Strong\")\n", "\n", "# Write strong enhancers and silencers to file for the SVM\n", "seq_bins_dir = os.path.join(data_dir, \"ActivityBins\")\n", "positives_fasta = os.path.join(seq_bins_dir, \"strongEnhancer.fasta\")\n", "negatives_fasta = os.path.join(seq_bins_dir, \"silencer.fasta\")\n", "all_strong_mask = activity_df[\"group_name_WT\"].str.contains(\"Strong\")\n", "all_strong_mask = all_strong_mask & all_strong_mask.notna()\n", "strong_ids = activity_df.loc[all_strong_mask, \"variant_WT\"]\n", "fasta_seq_parse_manip.write_fasta(all_seqs[strong_ids.index + \"_\" + strong_ids], positives_fasta)\n", "all_silencer_mask = activity_df[\"group_name_WT\"].str.contains(\"Silencer\")\n", "all_silencer_mask = all_silencer_mask & all_silencer_mask.notna()\n", "silencer_ids = activity_df.loc[all_silencer_mask, \"variant_WT\"]\n", 
"fasta_seq_parse_manip.write_fasta(all_seqs[silencer_ids.index + \"_\" + silencer_ids], negatives_fasta)\n", "\n", "# Fit k-mer SVM\n", "print(\"Fitting k-mer Support Vector Machine. This will take a few minutes.\")\n", "# Hyperparameter setup\n", "seed = 1210\n", "word_len = 6\n", "max_mis = 1\n", "nfolds = 5\n", "\n", "models_dir = \"Models\"\n", "svm_dir = os.path.join(models_dir, \"StrongEnhancerVsSilencer\")\n", "if not os.path.exists(svm_dir):\n", " os.makedirs(svm_dir)\n", "\n", "# Fit the SVM\n", "svm_prefix = os.path.join(svm_dir, f\"gkmsvm_{word_len}_{word_len}_{max_mis}\")\n", "fig_list, xaxis, svm_tpr, svm_prec, svm_f1, svm_scores = gkmsvm.train_with_cv(positives_fasta, negatives_fasta, svm_prefix, num_folds=nfolds, word_len=word_len, info_pos=word_len, max_mis=max_mis, seed=seed)\n", "plt.close()\n", "\n", "# Fit logistic regression models\n", "print(\"Fitting strong enhancer vs. silencer logistic regression model for CRX occupancy.\")\n", "cv = StratifiedKFold(n_splits=nfolds, shuffle=True, random_state=seed)\n", "crx_clf = LogisticRegression()\n", "crx_clf, crx_tpr_list, crx_prec_list, crx_f1_list = modeling.train_estimate_variance(crx_clf, cv, wt_occupancy_df.loc[silencer_modeling_mask, \"CRX\"], labels_with_silencer, xaxis, positive_cutoff=0)\n", "\n", "print(\"Fitting strong enhancer vs. 
silencer logistic regression model for 8 TFs.\")\n", "occ_clf = LogisticRegression()\n", "param_grid = {\"C\": np.logspace(-4, 4, 9)}\n", "np.random.seed(seed)\n", "occ_clf, occ_tpr_list, occ_prec_list = modeling.grid_search_hyperparams(occ_clf, nfolds, param_grid, \"f1\", wt_occupancy_df[silencer_modeling_mask], labels_with_silencer, xaxis, positive_cutoff=0)\n", "c_opt = occ_clf.get_params()[\"C\"]\n", "print(f\"Optimal regularization strength (C): {c_opt:1.1e}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "figure: Figure 2.\n", ":::\n", "### Strong enhancers contain a diverse array of motifs.\n", ":::\n", "{#fig2}" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "caption": "### Figure 2\n\n(**a**) Receiver operating characteristic for classifying strong enhancers from silencers. Solid black, 6-mer support vector machine (SVM); orange, eight transcription factors (TFs) predicted occupancy logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy logistic regression; dashed black, chance; shaded area, 1 standard deviation based on fivefold cross-validation. (**b**) Total predicted TF occupancy in each activity class.\n\n### Figure 2-figure supplement 1. Precision recall curve for strong enhancer vs. 
silencer classifiers.\n\nSolid black, 6-mer support vector machine (SVM); orange, eight transcription factors (TFs) predicted occupancy logistic regression; aqua, predicted cone-rod homeobox (CRX) occupancy logistic regression; dashed black, chance; shaded area, 1 standard deviation based on fivefold cross-validation.", "id": "fig2ab", "label": "Figure 2a and b, and Figure 2—figure supplement 1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model metrics:\n", "SVM\tAUROC=0.781+/-0.013\tAUPR=0.812+/-0.020\n", "8 TFs\tAUROC=0.698+/-0.036\tAUPR=0.745+/-0.032\n", "CRX\tAUROC=0.548+/-0.023\tAUPR=0.571+/-0.020\n", "Total predicted occupancy of all TFs in each group:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_WT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>Silencer</th>\n", " <td>837.0</td>\n", " <td>3.588419</td>\n", " <td>1.848387</td>\n", " <td>0.067069</td>\n", " <td>2.167386</td>\n", " <td>3.408131</td>\n", " <td>4.845272</td>\n", " <td>11.848887</td>\n", " </tr>\n", " <tr>\n", " <th>Inactive</th>\n", " <td>928.0</td>\n", " <td>3.005903</td>\n", " <td>1.690368</td>\n", " <td>0.034470</td>\n", " <td>1.777625</td>\n", " <td>2.810142</td>\n", " <td>3.968906</td>\n", " <td>12.011682</td>\n", " </tr>\n", " <tr>\n", " <th>Weak 
enhancer</th>\n", " <td>1360.0</td>\n", " <td>3.068334</td>\n", " <td>1.582532</td>\n", " <td>0.010029</td>\n", " <td>1.935493</td>\n", " <td>2.921969</td>\n", " <td>4.031018</td>\n", " <td>12.521734</td>\n", " </tr>\n", " <tr>\n", " <th>Strong enhancer</th>\n", " <td>1051.0</td>\n", " <td>3.782727</td>\n", " <td>1.622289</td>\n", " <td>0.021160</td>\n", " <td>2.577761</td>\n", " <td>3.664645</td>\n", " <td>4.762179</td>\n", " <td>10.185356</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_WT \n", "Silencer 837.0 3.588419 1.848387 0.067069 2.167386 3.408131 \n", "Inactive 928.0 3.005903 1.690368 0.034470 1.777625 2.810142 \n", "Weak enhancer 1360.0 3.068334 1.582532 0.010029 1.935493 2.921969 \n", "Strong enhancer 1051.0 3.782727 1.622289 0.021160 2.577761 3.664645 \n", "\n", " 75% max \n", "group_name_WT \n", "Silencer 4.845272 11.848887 \n", "Inactive 3.968906 12.011682 \n", "Weak enhancer 4.031018 12.521734 \n", "Strong enhancer 4.762179 10.185356 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Figure 2, panels A and B:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x288 with 3 Axes>" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Figure 2--figure supplement 1:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Generate the figure -- this has to be done in a few pieces\n", "modeling_xaxis = np.linspace(0, 1, 100)\n", "fig, ax_list = plot_utils.setup_multiplot(2, sharex=False, sharey=False)\n", "# Separate figure handle for the PR curves\n", "fig_pr, ax_pr = plt.subplots()\n", "\n", "# 2a and Figure 2--figure supplement 1: ROC and PR curves with SVM, TF occupancies, CRX occupancy\n", "model_data = [ # (TPR, precision, name, color)\n", " 
(svm_tpr, svm_prec, \"SVM\", \"black\"),\n", " (occ_tpr_list, occ_prec_list, f\"{n_tfs} TFs\", \"#E69B04\"),\n", " (crx_tpr_list, crx_prec_list, \"CRX\", \"#009980\")\n", "]\n", "\n", "model_tprs, model_precs, model_names, model_colors = zip(*model_data)\n", "prc_chance = activity_df[\"group_name_WT\"].str.contains(\"Strong\").sum() / activity_df[\"group_name_WT\"].str.contains(\"Strong|Silencer\").sum()\n", "\n", "# Generate figures\n", "_, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std = plot_utils.roc_pr_curves(\n", " modeling_xaxis, model_tprs, model_precs, model_names, model_colors=model_colors,\n", " prc_chance=prc_chance, figax=([fig, fig_pr], [ax_list[0], ax_pr])\n", ")\n", "ax_list[0].set_xticks(np.linspace(0, 1, 6))\n", "plot_utils.add_letter(ax_list[0], -0.25, 1.03, \"a\")\n", "\n", "# Display model metrics\n", "print(\"Model metrics:\")\n", "for name, auroc, auroc_std, aupr, aupr_std in zip(model_names, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std):\n", " print(f\"{name}\\tAUROC={auroc:.3f}+/-{auroc_std:.3f}\\tAUPR={aupr:.3f}+/-{aupr_std:.3f}\")\n", "\n", "# Calculate total predicted occupancy of each class\n", "wt_entropy_grouper = wt_entropy_df.groupby(activity_df[\"group_name_WT\"])\n", "print(\"Total predicted occupancy of all TFs in each group:\")\n", "display(wt_entropy_grouper[\"total_occupancy\"].describe())\n", "\n", "# 2b: Total predicted occupancy of each class\n", "ax = ax_list[1]\n", "fig = plot_utils.violin_plot_groupby(wt_entropy_grouper[\"total_occupancy\"], \"Total predicted TF occupancy\", class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))\n", "plot_utils.rotate_ticks(ax.get_xticklabels())\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"b\")\n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)\n", "\n", 
"print(\"Figure 2, panels A and B:\")\n", "fig.tight_layout()\n", "display(fig)\n", "print(\"Figure 2--figure supplement 1:\")\n", "display(fig_pr)\n", "plt.close()\n", "plt.close()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "caption": "(**c**) Frequency of TF motifs in each activity class.", "id": "fig2c", "label": "Figure 2c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "in validate_matrix(): Row sums in df are not close to 1. Reormalizing rows...\n", "Figure 2c\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAb4AAAEuCAYAAADx63eqAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzsnXd4lFX2xz9nZtJIAxJKEnoJkABBqSJi7yAi6yoq2LFhwXV1XXXtrqv+dnXtZVVExYoCdhQLIAgoNZTQSwKk9zbl/P54J3UCCSQhCbmf53mfzNx73jvfmbwz5733nnOvqCoGg8FgMLQWbE0twGAwGAyGo4lxfAaDwWBoVRjHZzAYDIZWhXF8BoPBYGhVGMdnMBgMhlaFcXwGg8FgaFUYx2cwGAyGVoVxfAaDwWBoVRjHZzAYDPVERKSpNRjqjnF8BoOhyRGR/iJykogEiIjDW9aSnEkwtDjNrRbj+AwGQ5MiIhcB84D7gbeA20SkvapqS3AkIjIeWCoiJ7UUza0d4/gMrZ6yHyrzg3X0ERE/YBJwraqeDXwEdATuLnN+TSqwFkQkHngK+BV4TkTGGufX/DGOz2CAPgDN/Uf2YIhIQFNrqAc2IBwYBKCqnwPzvXWXiUhz/41KBR5T1RuA54EXRORk4/yaN839ojIYGhURORPYLCK3NrWWI0FEzsbqafyzbG6sJeCdz+urqiXAv4BzROR0b/WvwCrgJJrpb5R3TrKTqqYBHwCo6lvAv6nq/Pq08BuTY5JmeVEZDEcDETkXeALrTr2TiAS1pLt0ETkN+C/wHXAOcF/TKqobIjIOeAOI9RatBr4GJovImWrxIRABDGkimQfFe928BIQAqKrbWy6q+jaW83tKRF7wPg5qIqmGg9Bi7hANhoZERE4AHgNuBrYDPwDfq+pPTamrrniHAC8BXlTVOSKSBkwQkanA98CBsh/k5oSItAceAK5T1UUi4lDVPBH5BHABd3jnzXKAGGBvE8r1QUTOx7pZul5Vt1Wvxhoxf0tEzsL6/5yhqtlHW6fh0BjHZ2hVeO/KFetufZqq/u4tfwO4T0QSvcNXzRpV9YjISuB8EcnA6rW+A1wGDMfqkWxsQokHwwbkeZ1eR+BpEQkFPgeWAWuA24Ai4DJV3d90UqsiIm2AK4AsVV0uImFYN07tgbmqusRrdxrWnOVpqrquyQQbDoq00Pl8g+GIEJFoVU0REX9VLfVGFbqA7sBDwH9V9Q8RsTfTHlM0kOF9Ggn8GRgMFKvqTSLij+UAE1X10SaSeUhE5HVgBXA+ViBLBnABsFJVX/TOVXpU1dOEMn3wDoPHAjcAXYEewFeAB7gSuEZVfxKRDkCQqu5uKq2GQ2N6fIZWg4icA/xDRFYDB0TkGaDQ2wPcKSIFwKPA+c3U6Z0DPAhsxuoRPaCq/xGRYcBVlZz6
YiCquThv77ByOICqfgP8CJwIuFT1Da9NNvB3EXlHVfOaTGwNiMhIoA2Q470peg54FvhYVZ/y2mQA14rI4pYwYtDaMY7P0CrwRm8+BdwOhAHjgdJqKQwzgG9FZIKqzm0CmQdFRE7FCmS5BijG6umdA7wL5AL+WInfTuBi4KJm4vTOA/4JfAsMFBEXMBfoBgwTkRtU9VWsoecCrN5Ts8EbyPJfLGfdSUQOqOo071xqYTXzLFV1HXWRhsPGOD5Da+EU4FZV/VlEemKF0D8sIulYd+57ADvwJdZcU3NjGPCoqi6G8nmkscC7qpokIp8DvYEBwIWquqnppFqIyPHAI8CNqrpURB4DQrECQJ4Ukd3AGBGZB0RjBbwUNKHkKoiIHWsI8xFVneWd0/tGRD5S1T9XspsCXOW1NbQAzByf4ZjGO8zWEcj2Or0Q4BPgF2AdcDzWfM1tqlrYXIYHyxCRWCyHvA9op6o7vOUjgRmqeqn3uV1V3SJiay5zYyIyArCp6jJvNOd64HcgD9ijqvd47eKBNFVNbTq1NSMi9wApqjqrUtkiYIOq3uB17vdgJbGbQJYWgunxGY5ZROQCrJSF1UCwiKSo6hYRuUtV13ttMoGrVLUQKnKymgMiciHwMFa6xXZgm4i8papFWKHzPb12U4F+IvIg0OT6RSRWVZO8kY92b+rFZOBeVZ0pIjHA+yJyqqr+qKqJTSy5CmX6vU+Tgb+JyM+VglUmAq+ISF8sZz5NVXOaQqvhyDAJ7IZjEhGJAG7BComfipUXNtRbXjn/qhfQxRtS32zw6rwBmKyqE4FE4Grgr95e6x5gu4hcjDVvOUtVXU297Jo3OX21iMwG60bC2wN9Q1VnesuSga1ASdMprZlK+stWY3kX+AxYIiLdvGXpWJHAEapaapxey8M4PsOxigtrxYz+3rmZU4DLsQIV7hORdiIyHbgDuKu5RRJi6Q8BOgOo6pvATqwUhvOxhgvPxRpmm9pM5vSCgbLPtERE3q1U7a5kdxFWCkby0VV4aKrpL67kvB8A3gbmi8gNInIflv4DTaXVUD/MHJ/hmEVE/gTcCziBL1X1UW9QyBSs7W9uBP7ZXOdmRORGYAxWRGR/rFzDRcAI4DpgAVbATrNJVPfmGeYCgcArWPmFV3jr/IBpWJGpV5YNNzcnatDvVNXJ3rqJWDciQ4Fnm6N+Q90wjs9wTCMi7YC/Az+r6hfess+Af6vqoiYVVwsiEg6MA07HCs6501v+taqeK9ZyX802fN47XPsaUKSqV4hIf+BsrJuQrU2rrnYq6S9V1cneIJx8Vd3VxNIM9cQEtxiOaVQ1S0QWAn8WkVKsO/nuNLM1IGvCO3f0nojMLovU9AaytPXO8zWb0P+aUNUMEbkBa1myzVgBOWOb0zJkh6IG/XasIXNDC8c4PkNrYCnQF2v3gmLg6rK0gJZAJad3DXAXcImq5jetqrqhqukishZrPvLMluL0yqhBf7O/YTLUjhnqNLQavJGboqq5Ta3lSBCR7oBfSxgmLMM71PwR8BdVXdvUeg6Xlq7fUDPG8RkMhkZFRAJVtbipdRwpLV2/wRfj+AwGg8HQqjB5fAaDwWBoVRjHZzAYDIZWhXF8BoPBYGhVGMdnaNWIyLSm1lAfjP6mpaXrb60Yx2do7bT0Hy6jv2lp6fpbJcbxGQwGg6FVUZ90hlaTB1HidlHstpZEDHH4Y7c1+/sFaWoBDUWgiIY24v1ZMUpgC/64jP6mpTH15+GhWLXlfjjNGLNkWR34+/If+Pe6ZQB8fe7lnNO1TxMraj2EYmMSbZpahsFw1PmUwqaWcMzS7LsuzYEVaSmVHjerLcQMBoPBcJjUyfGJSBcRmSkiu0SkWEQ2rlixorG1NQtcHg+/p+8rf748NeUQ1gaDwWBo7tTq+ESkDbAQmIq1sv0sICslpXU4gA1ZaRS6nIT7BwCwPC0Zs8ybwWAwtFzqMsd3HtaWLvuA41S1EMDpdLaKX//l3qHNs7r0ZmHyDlKL
Ctidn0P30LZNrMxgMBgMR0Jdhjp7ev+uK3N6AH5+fo2jqJmxPNVyfEMjozguMsoqM/N8zZo27dpx45z3eS5/P4/vTGT45ItrtAsKD+fKt1/lqQPbeerAdsY9eG95XbuuXXg2b1+V4xXN44w7b20R+gHGP3I/D6xdxovOLJ+6xqKhtM9Y+CVPp+7gPznJ3L/6VxIuOP9oyG/x146hbtSlx1e2YecgEQlS1SIAl8uFw9H8gkJjYmJqtUlOrnBcT61eQmZJEf52Ow8ef7JPqkJZYMvxkVFklhTxffJ2VqSmcHGv+IYV3gIQkcuAO4H+QB6wGngcOANrk9cSwAVswNq/bKn3vPHA60CcqmZ6yyYAL3nLchpS5+QX/w9XaSl3d+pNlyGDmf7lx+xds459GzZVsbv4P0/i3yaI+3rEE9axA3f8MJ+MXXtY+va7ZO3Zyx2hUeW2ET268+jWNaz6dG5DSm00/QBpW7cz5+4HGHvjtY2uuaG1f3T73ezbsAmP202PEcO44/t5/CP2OHL3H2j2+pvy2jHUjbr0+L4CtgBRwCoReVVEfv7yyy8bV9kRkpycXH7UVFa5fNmBvdyz/Hv+tWYJj/7xC/N2ba7SVqHLybpM64t2fGQUx9fQ44uJian1OBYQkTuBZ4EngE5ANyzHNcFr8qGqhgCRwI/Ax2Xnqup8rHni/3jbagu8DNzU0E7Pv00bjps0gXkPPEZJQQHblixlzbyvGDllso/t4PHn8t1Tz+IsKiJj126W/G8Wo6+ZUmO7o6ZOZssvS8jYtbsh5Taq/mXvvE/iNwsozstrVM2NoT15XSIet9t6oordz4/2Xbu0GP2VOVrXjqHu1NplU9VCETkd687+FOBKYEd0dHQjS2t8/m/tUgAm9x7I7G3reWbtUib2HFBevyp9H25VuoeEExHYptzxrUxLwe3xYLfZqjjSMidXuawMVWVfgZvNmaXklHgocHoQICzARtdQPxI6BjTiO60fIhIOPAJcrapzKlXNB+aLyENlBarqEpH3gL+LSAdVTfNW3QZsEJGzgUuBn1V1XkNr7RTbB4/LReqWik3Kk9esp+/JY2q0F5FKjyF64IAa7UZNncxXjz7VsGJroLH0Hw0aWvvN8z9mwBmn4BcYSOI3C9i18o/GEe6lpV87hrpTp3QGVd2jqlNVtZuqBqrqgOHDhze2tkZlW24mc3ZupH1AEG+cfAGx4RH8emAPv+7fU25TNr9X5HZx+cI5/GPlTwAUuJxszE6v82sl5zn52y/p3PVTGnO35uPyKF3DHHQL80OB5fuKGvKtNQYnAIHAZ7UZiog/VgRwBpBVVq6q6cDtwHvAOCxH2OAEhIRQlFu1h1OUk0tgaIiPbeI3Czj7b3cSEBJCh969GH3NFPzb+CbL9xkzmtBOHfnjk88bQ3IVGkP/0aKhtb80/mJuD43i+XMvYsN3Cxs9mrqlXzuGutP8JumOEs+uW4ZHlbO79GZ7bhbnde1DUk4G/7d2KaM7dwVguXd+L7WogPe3rqty/vLUZAa271in13phVTZ78lwE2oUnToqkjZ/v/YZHlcXJRWzKKGVfgYsOQQ5C/W3YxFobzuVRzukZTOfgJvmXRQDpquo6hM2fRWQcEApkA5NqsF8GhAMfVeoJ+uBd8X4aQMhhLgdVkp9PUFholbLAsFCK8/J9bD+87W4uff5pHtmymoKMTFbM/oThk//kYzfqystY9ek8SgoKDkvLkdAY+o8WjaHd43KR+M0CTrv9JtK2bmft/K9alP6jee0Y6k6rdHwZxYW8uXk1ALO3rWf2tvXldZ/t3MjWnEz6hLcv7/HVxPK0ZK7pf1ydXi/U33J0LlXySj01Or45SfnM2WJ9we4Y2pYRUUF1fj9HgQwgUkQch3B+H6nqFSISCXwKDAV+qmbzGvAOcJGInFAW/FIdVX3Na0sHsR/Wbf6BpK3YHA469ulN6tZtAHRJGEhK4kYf28KsLN684rry5xMef5Cdy3+vYuMXGMjQ
iy/klYmXHY6MI6ah9R9NGlO7zeGgQ++eB61vCFr6tWOoO63S8b2yYSWFLifjusXy8piKMOmH//iJNzat4j/rlvLIsFPZnmeN1P1x0TQiAq1hjPm7NjN9yddVljGrjVuOa8vMxFxWHSjhb7+kk9AhgI5t7NgEMos9pBW5uGt4e0rcysaMUmZvzGNNWgmRQXb8bEKJ23KYTdjjW4oVsXkh8MmhDFU13dtjWyki76vqPgARuRboijXMuRJ4Q0SOU9XShhRaWljIqjnzGP/Ifcy6bjpdhwwmYcL5PDX6DB/byF49KcrOoTA7m7izTuekaVfzfyefU8VmyMTxFGZls/nHXxpS5lHRb3M4sNntiM2GzeHAERCA2+lEPZ5mrb1Tv1gie3Yn6adFuF0uhl0yib5jT2TO3Q80iu6G1l/G0b52DHXnmHR8+c5S3t2ylos//h9dgsP4y9JvmdI3gSGRnQEYKOm81svBsOAddMhZifgF4yk6wD0RuYyK60ybgNxyxxbdJrQ8fw/g9JheAKzNOEBhcQ66bRaurETsIT3YtvR1xOZHYdLboArqxr/zSQSF9mFETAGl9j0k53lw2drza1o+4X4BDI5sz4V9OxLsZ+PyuDDACoTp2qMX9oAg1qxdh79dCHJIlcn0o4mq5ojIP4AXRcQFfAc4sdIYToWqq+mq6mYR+Ra4G5ghItHA08AEVS0RkVeAyVgpEA82tN7ZN9/J1Ddf4unU7RRkZPL+TTPYt2ETfcaMZvrXn5aHmncfOoSLn/0XbdqGcyBpK29efq1P2PqoKy9j2awPGlriUdE/5fUXOOGqy8ufn3f/3cy86kaWznyvWWsXEcY9dC9RcTPxuN2kbtnGG5dcxZ5VaxpNd0PqL6Mprh1D3Wgx2xKVuNxkFZeQV1qCqhLs5wCx1tJ0iI1Ah51Aux/zd25i1tZ1RLcJZXSnrnQICmJ1xn5+S0thULsOXNd/KH8+MY7+UU6CA5QDOXYKSwU/uxIepMS0d1NYKiQ8+S3/Xf8bJ3aM4uUTz0bEukdwe1yMnD+LEreb98ZeQGTWEgrT11BiC8IW3hd/ewDFRQcIzEsiSIsI7XIGhe2HUlCUSVs/fwYfd275e1q7fjE5xfkE+wfRNjSavTkH2FuYS7HauebEswF4e8k3tPUPoGtwGBHBEbhcTlxuFyI2HHY7IoLgQdUK/bbZ/PHzC2gUDykilwMzgAFYeXy/Y0X7ngX0UdUrKtmOxEph6Ik1bJmiqjdXqu8HLAdGq2riwV6zg9jV7M5gaI18SiFp6jbbEjUC9XF8BkOjYxyfobViHF/jYbYlMhgMBkOrwjg+g8FgMLQqjOMzGAwGQ6vCOD6DwWAwtCpMcIuhWSMiacCuptZhMDQB3VW1Q1OLqIyItAf+hxXJnQ7cq6rvH8Q2AGsh+zOA9sA2r/3X3vqfgFFYO7oAJKtqv0Z9A16OyTw+w7FDc/viGwytnBeBUqwdWoYAX4rImoOkJDmAPcDJwG6sTc0/EpFBqrrTazNdVd9ofNm+wo4U01WshCt3OzmLb8KVuR5HRAKhQx/Cv+MIwFqHM+bdf1PkdrLgvCkM72jt4nCkWxbVtPtDNUwItMFgqBUR+QAoAr4GvlPV7EPYBgOTgIGqmg8sFpF5wBTgb9XtVbUAeKhS0RcisgNrOcOdDfUejgQzx9cAuLI3kf7ZMEpTFuIpTqU0eQEZX5yCepwApBTksb8on5zSksPa1cFgMBgambuxFpKYAuwSkUUicq+IDKnBNhZwqWpSpbI1QJ125RaRTt42KvcO/yki6SKyREROOaJ3cASYoc4GIHfF/aiz2mafHqe1bBmwOafC2SXlZBxNaS2eyKAA7R4e3NQyjhy/lv0VE4dfU0uoH420LunRYGdGDun5heWjN13FocV1GGhLx5MIFFcqes278LsPqroba1Pol71zcqdhLSP4hIh8rarnVTIPAXKrNZGDtSPLIRERP6wtyWaqatnabvcAG7CGTi/F2ttziKpu
q629+tKyv5XNAE9JFiW7vwAgeNAMQoc+jDP9D7IX3VBuszk7o8bHZUOWh9rAti71xzLdw4P5bcrpTS3jyOnUqakV1AuJrNvWW82WosLabZopI5+cWeV5Ccol4rs3YHVe1NxiVR1W19cRkQ7A2cC5WEEre4En8N1/Mx8Iq1YWhrV84aHatwGzsBzc9LJyVf2tktlMEZmMNQ/4fF21HynG8dUTZ8ZaUDf2sL6EjngSERv+nU8k4vzvQexARS8vvl2HKr0/g8FgqCsCOOoye38Y0Rci8ibWris/AF8Bd5XtqFIDSYBDRPqq6hZvWQJVhy6rty9YUaCdgPNU1VmL8qMSn2Dm+OqJK9NaMT6g6zlYNzYW9jadEZvl+DbnZNA5KIQhEZ3ZkpOJx6SQGAyGI8BWh+Mw+RfQUVUvVtW3DuH0yoJV5gCPiEiwiJwITMDqzR2Ml7EWtR+vqkVlhSLSVkTOFpFAEXF4F8AfC3xz+G/h8DGOr564sjcD4Nfu4PO7m7PT6RPenj5h7Sl2u9iTn3O05BkMhmMEQXBI7cdh8irgFBGt4Vhcg/3NQBCQCswGbqqcyiAiX4vI372PuwM3YKU97BeRfO9xOeAHPAakYeUD3gpcWC1wptEwQ531RF3WrukS0A6AgvXPUZLyMwAhx/0dbT+EnXnZnBzVgz7h7QGrB9g9tG3TCG4FZBaVMu3blSzYeYDIoAAeGzuQyQO6+dj9tDuVx5ZuZNWBLNoF+rN1WsU8fmpBMXf+uIZf9qRR4HQRHxnO06cOZmRUROPrLyxm2kcLWZC0h8jgQB477wQmHxfrq3/rXh77fiWrktNoFxTA1r9PrVK/MzOX6z5ayPLdB+jWNoTnLhzL6bFdG1d7fiHXvzWXBeu3ERnahscnncHkEwb72Kkq9368gDd/+QOAa8Yezz8vPrN8z8mFG7Zzz4ffsTU1k8jQNtx93hiuP6XO01ZHrr+giOvf/ZoFG3cSGRLE4xNOZvLwuJr1f/4zb/5qjfhcMzqBf154crn++Wu3cv/cn9mZmcOg6A68dsW5xEVF1ktbnYc6DwNVPeUw7TOxhkYPVn9upce7OPTQ5fDDee2GxPT46ol6rH3wxOYPgDNjNSW751Oyez6e4jS25mSiQJ+wdvQJ8zo+k9LQqNz2wyr87TaSbx7PzPNHMH3BHySm+/ay2/g5uGpgD5482feHOd/pYmjndvw25XRSp09gSnx3JsxZQn6py8e2wfV/9gv+DjvJD17NzMvOZPqcn0nc7xsN3Mbfj6uGD+DJ80fX2M4V733HkOhIDjx8LY+cM4pLZn1DWn5RjbYNxa3vfom/3U7Kc3/lnWmTuGXWFyQmp/rYvf7TSuat2sQfj9zEqkdv5ss1Sbz200oAnC43f3rhA64/ZSiZL93L+zdezF0ffMua3fsbVTvArR8usPQ/OZ13rhrPLbO/JTElzVf/4jXMW7OFP/5+Navuu4Yv123ltUWrAdiSmsnUt+fz4uSzyXjmDsYN6sPEVz7F5a5nhKlYm/TWdhhqxzi+eiKOIAA8pTXnfW72Brb0CW9P7zCrV9jSUhpE5FIR+U1ECkQk1fv4ZrF4W0RKKw1j5IvIJd7zdorIGd7HUSIyT0RSvMMoPRpDa0GpizlJe3noxHhC/B2M6RLJuD7RvLdht4/tiKj2XBHfnV5tfdMlerUNYcawWKJCgrDbhOsTelHq9rA585ABbA2g38mcddt46OyRhAT4M6ZnNOPievDeH74jQCO6deKKof3o1b56oB0kpWWzKjmNB88aQZCfg4sG92ZgVARz1jVepHhBSSlzVm7k4YtOIyQwgDGx3Rk/pB/v/uq7c/o7S9Yw4+zRdGkfTky7MGacfQIzF1uOI7OgiNyiEq4YnYCIMLxXDAOiItlYgwNqcP2rNvPw+JMICfRnTJ8ujB/cl3eX+8ZuvLNsHTPOGE6XdmHEtA1lxunDmblsHQDfbdjBmN5dGNOnCw67jbvPGkVydj4/b/G9Bg+Hsh5f
bYehdozjqwMxMTEHPezB1tCRpzAFgPCT3iDk+H+Un1vm5F7esJKbF38FVE1paO6IyF+A54Cngc5Y0Vk3AicC/l6zp1Q1pNLxYQ1NebAmric1pt6krDwcNhux7StSixI6hLMhvXr60eGxOjWbUreHPu1qDyevD0lp2Zb+DhVD4QlRkWzYn3lY7WzYn0mviHBCA/3LywZHRbLhwOG1czgk7c/AYbcR27liSG9w185sqKHHtyEllcFdO1e1S7HsOoWHcOnIQby9eBVuj4elW/ewKyOHE/v6Dlc3qP7ULOuz79S+QldMBzak+I7QbNiXzuCYilSPwV06smFfxfe6cviaoqgqiTW0czhYjq/B5/haJS12jk9dRXiceaCKzT8ccQTiUSUxvZSNmaVkFbtRBZtAmL+Ni/uFYrcd2UVROX8uJiamyvOiHVaqS+m+nyHhbiuS05vGABVO7ud9Fessb24hPT4RCQceAaaq6qeVqlYBl3tt6tSWqh4AXhKRRr3mCpwuwvyrvkRYgB95pYeKoj40uSVOrvpqOQ+MjiM8oHETugtKnIRVe42wIH/ySkoPq538UidhlZweQHigP8k5BfXWeNDXLCklLDCg6mu2CSCv2Fd7fnEp4W0qbMODAskvLkVVEREuGTWIG96ay4z3rSC/F6eeT9eI8EbTXq4/qJr+oIAaP/v8EifhQQFV7PJLLP2n9+/BvZ//zE9JuxndK4anvltGqdtNYT2uwTJMT6VhaFGOTz1Ocpf9hZK93xHYbRyOtv3B5sBTnIF/x1G8mhLH0pRiTu0axGUDwgj1t6GqpBW5qe332V2UhjP1N9yFyYg4sE4Q/KNOxhHW66Dn+bUfBEBJ8kKc6avwizyuSn1NeXu783MocjkJav6rYpwABABzm1pIXQn2c5BbbR4ur8RFqP+RfdZFTjcXfraEkVER3DOyf0NIPCTBAX7kllT9gcwrLiU0wP8gZ9RMiL+fj8PJLSklNLDxrrmQAH9yi0uqvmZRSZVeZ7ltoD+5RRW2ucUlhAT6IyJs2pfG5S9/zMfTL+XM+F5sOZDJhGffI6ptGOcn+Ab5NKj+omr6D/LZhwT4VXmvucWlhARY+vt3juCtqedx+4cL2Jebz+XD44nrHEmXdrUucHJIGiO4pbXSohyfpziDom0fYfMPI6jftdgC2lK05V3yV/+TgJgzKA1/BYAAhxBgt64QESHIcej7JFfuNtI+GYQjPJaIcT+i7hJK9n5rpSqI7ZCOzx7WG3t4P9w5m8n44lQCuo2jdP+i8vqk7Az6t43kgePHAvDZjk18smMDW3IyGRxx5Kt6HGqB6wZc4SUSSFfVck8iIr8CcVgO8Wxv8V0iUrYig0tV6xW+JiLTgGkA3ULbHNa5se1CcXk8bMnKo6/3h2ZNWjZxkb7zYLVR4nIzae6vxIQG8fJZxx/2+UdCbIe2lv60bPp6hzvXpGQQ17l9LWdWJa5ze7Zn5lpO0+t41qakc2kN0aENRWznCFxuD1v2Z9C3sxX9unbPAeJifFd/iYvuyNo9+xnRq4tlt3s/cdGWXeLeVGI7R3D2oD4A9IuK5LyEWL5du6VRHV9sx3bWZ5+aSd+O1ue9dm8qcdG+l3NcVCRr96Yyokd0hV2liN9Jx/dn0vHWjVJ2YTFvLl3LsO5R9dJXNtRpqD8tqufJw+8pAAAgAElEQVRsb9OZjpduJS/hceb99irf//IYf6SnEDzscYIHzeDWeA/T+2bhLkrnlZV7eWnFXl5esYePEveXLZvJnvyc8vm5vFLrjs0W0hP7qfPJ7zSB7StfZu+muaTn55OTn0WO2xrO2J6bxbRf5nPT/NlM+2U+23OzAMuxhiTcDYC6Cije/mH5fF+m00NGSRHDO0RzWZ9BXNZnEOO69wVq7gkeDsnJyVUcXNnzBl7WLAOIrDw8qaqjVbWtt67s+nlGVdt6j/rFbFuv8ZqqDlPVYZFtAmo/oRLB/g4m9o3h4SUbKCh1sSQ5nflbU7g8znd+
yKNKscuN0+1BFYpdbkq9kXdOt4dL5i0jyGHnrXOHYztKPzjB/n5MHNiLh79bTkGpkyU79jF/ww4uP973B9/jUYqdLpweD6rW41KXFWUc26EtCdGRPLpgBcVOF5+v2866fRlcNKh342kP8Gfi0AE89PlCCkpKWbJlN/NWbeKK0Qk+tlNOTODZb5eSnJVLSlYu//n2V64cY62LPKR7FFsOZLJww3ZUlW2pmXy5JolBXRt3+bfgAH8mDonloS8WW/q37WXe2i1cMcI3R3fKyIE8+8MKkrPzSMnO4z8/LOfKUYPK63/fvR+3x0NaXiE3vv8N4wf1oX/n+qfCNEICe6ukRfX4St1u/rthE9/ucXJGzKXEtm3Pp3uSeO2Ak79H9eWOU+LoFuFmR5qdvGIbkSFuOoZ5EIEzP97GixtWsTZzP/d+O48DRXlMXjiHcd36MalHPD+WDCczZCijogIJD7CVz135OYSvk5bw4qa1nBzVnSFtQwh053L7L59yU2w858WeQFDfKXgKU8j7/UFQD4iDkCH3sK7NIGAlfcMr7tYrUhqOfJ6vqDCDlI0f4yzJY/GPb+JnU/Ylvg+l6QR2PpF2UUPr9TlXYilQgrU6w6e12DYbnj/jeK7/diXRL80nIsifF848nvjIcBbvTWPcp4vJvn0iAIv2pHHGR7+Unxf67GeM7RLJD5eewtKUDL7cvo8gh53I5ytGer+YNIYxXRp3i8DnLzqZ6z9aSPRDbxIRHMgLF51MfOcIFm9PYdz/5pP9uLUO7KIdKZzxyucV+v/+KmN7RfPDTdb7e+/ys7j2wx/o8I836NYulA+nnEOHkKBG1f7ClPO57s25RN32FBEhbXhxyjjiYzqyKGkX4/79Ljmv3AfAtFOGsT0tiyEPvARYeXzTvHl6vTu2541rJjDj/a/ZlZFNeFAgk0cN4tqxjd/rfuHSs7hu1tdE3fMCEcGBvDj5bOKjO7Bo6x7GvfgxOf+509J/0hC2p2cz5LE3Lf2jBzPtpIoNDWZ8/D1r96bhZ7fxp+P78cyk0+qtTcT0+BoKswO74ZCIyN3AX4BbgG+BAmAw8CMwEbgK2Kuq99dw7k7gOlX93vs8ELBjLXbbH9ilqsXVz6vM0M7t1SxS3XSYRaqbjpFPzmTlrn3lnq673aF/C6p94YubCzJ+P5xFqlsjLarHZzj6qOpTIpKMtW/XO1iObzvWliK/Yjm+ulI5e7psaxJzC2sw1JEjDEw3VMM4PkOtqOp7WHtp1cRVhzivR7Xn5mtrMBwhZWt1GuqPcXwGg8HQAhDAbvxeg2Acn8FgMLQQbGZmoEEwjs9gMBhaAGLW4mwwjOMzGAyGFoA11Gk8X0NQH8dn8iCaL8fOt8PtRvPym1rFESOlh7fGZrMj4PAWEGhuuFatbWoJR4wW+K6reux8seuOiCxT1VHexw+q6sP1bdMk+hsMBkMLoZWu3BLrzQEGK6e43pihzsNg8f7drM04wOCITozpbC2BVduamUdpTU2DwdAKaKUbzc4FkrwLYgSJyC81Ganq2Lo22GocX0M4oKfX/Mq8XZuZ0L1fueMzGAyGo4FgLXvU2lDVq0VkDNADGA78r75tthrHV1eu/mku2aXF/OP4sRwXWXU19TUZ+62/mQeaQprBYGjlHK3F0psbqroYWCwi/qo6s77tGcdXif2F+bydtBqA0Z26VHF8WSVF7MrPwS7CzrxsskuKaRsQ6LNJLeCza8LB6gwGg6GuCK02uGWsqpYNb+4UkRpX/FbVhXVts9U4vtocFMBvqXvLHy87ULVubYbVyzstuicLkrezNvMAY6O6N5ZcQ31oE4J96gxkwPGQn4P787fRFT/5mEnsYGznX4Z06wOF+bjuu6pKvX3Gk0h0D3D4QcZ+3PNnoWuWNbr8zOJSpn2/mgW704gM8uex0QOY3K+Lj91Pe9J5bPlmVqXl0C7Aj61Xn1lel1pYwp2/rOeX5HQKnG7iI0J5+qSBjOzcrnG1FxRx/ezvWLBp
F5HBQTw+fgyThw3wsVNV7p23iDeXrgfgmhMG8s8LTiqfw5q/bhv3f7GYnRm5DIqO5LXJZ1XZ767RCA7FcfWd2OKHQl4urk//h+e3H33MpH8CjvFXIN37QmEepXdPqVLv99enkS49wOGHpu/H/dlMPKuX1lteQ63cIiLtsYYMzwLSgXtV9f0a7P4KXAl099q9pKpPV6rvAbwFjAR2A9PLFqVvQF4CBnofH2yYU4GDb5xajVbj+OrCb6mWsxvWIZrf0vZWqSsb3ry090AWJG9nTcZ+4/iaKfbJt4DLievuyUiX3tinP4xr73bYt7uqYWkxnl+/gxU/Yz/3Ep923B+9Yp3j8SA9+mG/4wlc/7gOvHsxNha3/bQOf7uN5OvOZnV6DhPm/cbgyDDiI6puptvGz85Vcd24xO3mXyu2VKnLd7oY2rEtT58UT8egAN7csIsJ835j61VnEOLfeF/7Wz9eiL/dTsrjN7J6bxoXvPoZg2M6EB9VdZvG139dx7x12/jjnimIwDkvfkrPiHBuGJPAltQspr7zNfNvnMioHlE888NKJr7+OYn3XY3D3rhxi44rbgWXi9I7/ox0643f7Y/j3LMdTdlV1bCkGPfib2H5jzjOn+zTjmv2S9Y5Hg/Sqz9+d/2L0nuvhpzMI9YmSEOu3PIiUAp0AoYAX4rIGlVN9HlZmAqsBXoD34nIHlX9wFs/G2v7svO8xyci0ldV0xpKqKoOrPS4Z0O0eYxGvx4Zy1L30i0knHO79iG5II+9+bnldWsy9mMTYWLP/thEWO2d72stiMhOESkSkXwR2S8ib4tISDWbh0RERWRktfJbRWS9iPhXKrtDRFZV3uS2QfAPQI47Efe8WVBSjG5LRNcswzbSd2sj3ZmE/rYQ0vfV3FbyTvB4yqzB7kDaN+5efAVOF3O2pvDQqP6E+DsYEx3BuJ6deW/TXh/bEZ3bccWArvQKC/ap6xUezIzjexMVHIjdJlw/sAelbg+bsxsvJ7KgxMmcNVt4+PwTCQnwZ0zvGMYP7M27Kzb62L7zWyIzTh1Kl3ahxLQNZcZpQ5n5m/Wb+92mnYzpHcOY3jE47DbuPmM4yTn5/LzV9zNoUPwDsQ0dg/uzt61rZ0sintVLsY0+w8dUd2zGs/R7NK3ma0f37qi4drSBrh2xVm+p7ai1GZFgYBLwgKrme+fP5gFTqtuq6lOq+oequlR1M1aE5YnedmKB44EHVbVIVT8F1nnbbhREZO5ByuccTjvG8XlxezysSEthWGQ0Q71ze5WHPldnHKBPWHvaBQTRJ6w9azJaZYDLeFUNwbpDPA64t6xCrDGqqUCm929lXgSygfu8tr2Ah4FrVdXVoAo7dQGPG1Irhqo1eQcSfWS9c/vND+F4fi6Ovz2HJq1Fd22p/aR6kJRVgMNmI7ZdxT1FQocwNmTk1avd1Wk5lHo89An3dZINRVJqlqW9Y8Vw6uCYDmzY57vp8ob9GQyO6VDVbn+FnVZaH0NRVCFxX3ojKbeQzjHWggmVpjl0z7YjvnYctz+K/6tf4v/AC+imNejOpHprtCO1HkCkiKysdEyr1kws4FLVyoLWAL5bzVfC+x0/CSjrFcYD21W18sVZazv15NSDlJ9yOI2YoU4vG7LTyHeWMrRDFEMjowGrBzipVxwuj4fErFTO7doXjyrx7Trw1Z4tuDweHLbWd++gqvtF5FssB1jGSUAUcB3wXxGZoaqlXnuPiFwLrBCRT4BnseYK/mhobRIQ6LP5qBYVIIFHtvO4+6WHwGZHBhyHdO5q3b03IgVOF2HVhiLD/P3Icx75/UFuiZOrvvuDB0b0IzzAr74SD0p+aSlhgf5VysKD/Mkr8V29Jr/ESXhQxaow4YEB5Jc4UVVO79ede+ct4qctexjdM5qnvl9BqdtNYWnD3iP5EBAExb7Xji2wzRE153ruAbDbscUdj0R1q/e1I9R5P770WjaiDQFyq5XlAKG1tPsQVmfprUrt5NTQzsFzx44QEXnE+9C/
0uMyegHVxqIPjXF8Xsrm9/5I30ehy4lDbOVlm7PTKXG7+XznJuyvV3zmm7PTiW9/5DtUq3pwZa7Dlb0Zdeah6kIcwfi1H4hfxJDaG2giRKQLcC5QOYrqSmA+8BHwX2A88GlZpapuFpF/Yu3cngGc3xjatKQYgqr+UElgGyguOsgZdcDjRhNXYjttApKWgq79rZ4qD06wn4Pcaj/weaVOQv2O7Kta5HJz4fzljOzcjnuG920IiQclxN+f3OKqTi63uJTQAH9f2wA/cotLqtiFBPghIvTv1J63Lj+H2z9ZyL6cAi4fPoC4zhF0aRvi006DUlIE1ZycBAWjxfXYxd3txrNuBX5nTERTU+od4NJAM3z5QFi1sjDgoMMKIjIdayTnJFUt+8cddjv1oKv3r63SY7CCWvZgOeU6Yxyfl2UHrGHNT3dUzEesTEvB6XEfNG9vTeaBI3Z8qkrGvDE40/+g/dlf4NflLIp3zUNLsijd93NzdXyfi4hi3ektBB4EEJE2wMXAVFV1ent1U6nk+LwsAp4AXlXV4oO9iHdoZhpAt5DAg5nVzIG9YLNDx2hITbHa69LTNzjhSLDZkQ5RjbpIbWy7YFweD1uy8+nr/aFfk55LXERtN+O+lLjcTPpiOTGhgbx8WkJDS/UhtmM7S3tqFn29w51rk9NqjMaM6xzB2uQ0RnSPqrDrXGE36bhYJh0XC0B2YTFvLl3PsO6dG1W/7k8Gux3pGIN6b3qla6+GuXbs1rVTHxpwkeokwOENQikbu0+gYgiz6uuKXAP8DRirqpUnWhOBXiISWmm4MwHwiQ6tL6p6tVfLr6r6en3ba33jdAfht9Rk2gcEccOAodwwYCgndOpCkdvF+szU8sT1e4eM4amRZ3DvkDFARUL7kSAitOl/Pf6dTqBoxycUbHoDV/4uChJfoGjHYc3THk0uVNVQrPH0/kBZqN5EwAV85X3+HnCuiJRP4ngDW14Fngeme+f5akRVX1PVYao6LDLQt7dwSEpL0FW/Yh8/xQp06R2HJJyA57cffG1FrFQFu/f+r/LjTl2Q+GHg5285vBGnIn0H4klad3h6DpNgPwcTe0fx8LLNFDhdLEnJYP72/Vze3zedwaNKscuN0+NBgWKXm1K3FVDhdHu45KuVBDnsvHXmcUcl8Tk4wI+JCX156KtfKShxsmR7MvPWbeOK4b7pDFNGxPHsj3+QnJ1HSk4+//nxd64cWTE19PvuA7g9HtLyCrnxg+8ZP6gX/Tu1b9w3UFqM5/fF2CdeCf6BSJ94bENG4/m1huh877UjdgdQ9TqSzl2xDRpuXTt2O7ZRpyOxg/Bsrv+C2Q2xVqeqFgBzgEdEJFhETgQmALN836ZcjnWzeqaqbq/WThKwGnhQRAJFZCIwGN8b3gZDVV8Xkb4i8g8RedX797CHMkyPD8grLSExK5U/9YrjlZPGAfDFriTGfzub31KTWZ1xAH+bnUeGnYrDZsPl8fDvdUtZXc8Alzb9rmZF6KlctOAjhnXozMasNL489xcGte/UEG+r0VDVn0XkbeAZ4EKsYc4QYLc3D0sAP+Ay4DnvaQ8AqcDtQBGWEzyTRsA9+wXsU2fgePoDKMjF/f4LsG830ice+/RHcd1xEQDSdyCOO58qP8/vhXl4ktbi/vc9IIJt3OVI1L3g8aCpKbjfeBL2bGsMyVV4/tTBXP/9aqJf/5aIQH9eOHUw8RFhLE7OYNy8ZWTfZI0SL0rO4Iw5v5afF/rSl4yNieCHSSeydF8mX+48QJDDTuSrX5fbfHHBKMbENF4+3AsXn8Z1739H1H0vExEcxIt/Pp34qEgWbdvLuJc/I+eZWwGYduJgtmfkMOTJdwC45oRBTDtxcHk7M+b8yNrkNPzsNv40JJZnJp7SaJor43r3eRxX/wX/5z6C/Dxcs55DU3YhfQfiN+MJSm++AACJHYT/Pf9Xfl7Aa1/h2bQG51N3gQj2CVNw3HS/de0cSMb1yuPo7q311icNl85wM/Am1ncy
A7hJVRNF5CTga28QG8BjQATW/HzZue+q6o3ex5cCbwNZWHl8f2rIVIbqiMh4rBvrL7Dm9foBK0VkiqrOq3M7euQTrs1uW6K0Qhc5JR6C/GyM6GdFYm3csQdUCfW3ISLsyc9hVL84ABK3b6NtQCBbcjK4f8WPXNwrjkk9rbvT1KICbvv1a87s0pvXN/5BuH8A351fEe179lfvklVSxPKJ15eXHWp1lprq3t2ylhsWfcFFPQZwcnR3vtiVxI8pO/n4zIs5vV0AzsxEsAdBUEdwFSOuPPA4cbcfwa5CP/KdSocgOwF2QcSaO2/jJ4QHNFSaawXeBWKvK0tO9fbmdmLl7izEmvOrfEt7B9Zd4lARSQB+AYaq6lYRCfLaPqGqb3EIhnYI12WTRjf02zlqSA3zWy0J6dOnqSXUi5a8LdHoeb/xe3pu+Xd5gJ+/vhVR+03xCQf2/l5LcEuLRUTWAbep6o+Vyk4BXqic71drOy3J8bndLrJzU8grycPuF4K/3YHTVYI/HkKC2lLodLEjO5Ui9aNtUBsKSosIFA+RgYEEBXdi4b49ZJYUkRDRkXxnKVtzM+nfNpLh7SLJT1tHSUkeQaFd8HM4wFWAOnPxC44mtH3Flz8mJqaK8yoqzCIrewdFHiUwsD0uTyniLiHYJgSHdiOzJI/04gKwBxBgd1DoLKGdw07H4LakugSH2BnZrx8Au/fsJaUwD0EIcQSQlJNDbmkxUy+6CI/Dxueff0anwFCigvwoKkyhsLQIcYQiNhtuZxFt7EJocAf8AsIb3fF5y17GmotbrapDq9lHY92RDcNabeEjVX2qUv0pwCdAvKoetOtsHF/TYhxf01GT45sZWfs858j9e45lx5cFdKicBuXNBU5X1bZ1bqcejs9gaHSM42tajONrOmpyfO/UwfGNOLYd34/AN6r6r0pldwPnqeopdW3HzPEZDAZDC0BEWu3uDJW4CZgvIrdjpTF0BQqx0qfqjHF8BoPB0EKwN/z0fYtCVTeJyADgBKwFM1KA31TVeTjtGMdnMBgMLQChbmtxHut45/cW1acNk8dnMBgMLQEBm01qPY5lRCRBRBaKSKaIlHoPp4j4rot3CEyPz9Cska49cTzzXlPLOHLKd3doobhKardpxvhPbIzVs44OsvJPPmXHumOrA7OxEuRvw8oHPiKM4zMYDIYWgBnqBKAz8A+tZzpCfRyfyYNovpivh8FwrCFgNz2+mVgrQtVrGKjZ9fjKVjipicqJ459s30BmSRFnd+lN99A65y0aDAZDi0QQbK08qhN4ElgqIn8Hqix8oaqn1bWRZuf46oKqcu3P88h1lvD86HOZPnBEU0syGAyGxkWsXL5WzifADuAzWtsc3+78HHKd1qT72kpbBtW1t2gwGAwtERPcwhAgomyT6yOl2Tm+yg7qYIs+V3Z2aw+yV57BYDAcSwhmjg8rfy8OazukI6ZF5vGt9W4HNLpTV9ZlpuLxBvgkJyeXH2WUPY+JiTnoAdRab2g5ZGZmcdHkKYR06kqPuATe/+iTGu1UlXseeIjIbn2I7NaHex54iMrBYqvXrmPYSacR3LELw046jdVrG3cvvir6L7+SkKju9Bh4HO9/XPP2ZqrKPf94hMgesUT2iOWefzxSRf+02+6k/9BR2Nt25O33Zh8d7VlZXDTlWkK69qVHwkje/+Szg2t/6HEi+wwkss9A7nno8araZ9xN/xFjsUd25e33Pzoq2gEys7K56LrphMQeT49Rp/H+Z1/UaKeq3PPEM0QOGkXkoFHc88Qz5frTM7MYM/EyIgeNol38CEZPuJQlK/6ovzixojprO45xdgDfeffie6TycTiNNLseX11Ym5lKG4cf53btw68H9rA9N4s+4Y28SeVBqMvwqkeVp9csweXxMDU2ga4h4UdLHlC+s0I0EK2q6ZXKV2ENHfQEHsKKlqo8hLBNVRNEpAfWBVdQQ935wL3AQKAYa5+sGWU7MovIT8AowIkVCbwF+Bj4j6o2SpLY
9L/cjb+/P/u3bWT12vWMu/hSEgYNJH5A/yp2r701k7lffMXqpT8jIpx1wSR69ujOjddeTWlpKRdeegW333wjN19/Da+++TYXXnoFSatX4O/fuAtPT7/rHvz9/Ni/JZHV69Yz7s+XkTAwvgb97zD3y69YveQnS/+Ff6Jn927ceO1VACQMiueSiy7kbw8e1m9C/bTffb/12W9czer1iYy79EoSBsYR379fVe0z32PuV9+y+ufvLO2TLrO0X21t/ZUQH8clF17A3x5+4qhpB5h+/6PWZ79qEasTNzHuqhtJiOtHfL+qe52+9t5HzP32B1Z/9zmCcNbl19KzaxdunHIpIW3a8L9nHqdvz+6ICHO//YELrr6ZA6sX43DU7yfXDHXSBvgS8Mdap7OMw8oyaJk9vswDxLXrwMD2Hcuf18aheoPVHx/M/kjaBfj1wB7+tvwH7l/5I28n1auHXh92AJPLnojIIKyLqDJPqWpIpSOhWn3bGurCsTarjAYGADHA09XOm+7duT0K+AvW5pVfSSPM1BcUFPDp3Pk8cv+9hISEMGb0KC447xxmzf7Qx/ad9z7gzltvoUtMDDHR0dx56y3MfNfqGf20aAkul4s7brmRgIAAbrvpBlSVhT/Xa6Wkuumf90WF/hNGccG55zDrA99ezzuzP+TO6TfTJSaamOgo7px+EzPf/6C8/pbrr+X0U8YSGBjQqJortBfy6fyveOTevxISEsyYUSO44JwzmfWhb4/1nQ8+5s5bplVov2UaM2dXvMdbrruK008ec9S0AxQUFvLp1wt45K+3ERIczJgRQ7ngzFOZNcd3f9N3PvmcO6ddTZeozsREdeLOaVcx82OrdxsYGEC/3j2x2WyoKna7naycHDKzc+qlT8SK6qztOJZR1asPclxzOO00qeM7UOBixf5ifthVwKK9hezMcVYZ7qiJIpeTpJwM4tt1IL5dB6Bi6LO58sHW9fjZbCREdGL21vVV3mNJyo9k/TiF9LknkPbpEDK/GU/R9o8bQ8YsYGql51cC79S3UVV9X1W/UdVCVc0CXgdOPIhtgar+BFyAtcjs+fV9/eokbd2Gw+Egtm/FdjqDBw5kw8bNPraJmzaRMCi+/HnCoHgSN22y6jZuYvDA+CpRdIPj40ncuKmhJVehXH+f3hWvOzCeDZvqoH/gwHL9TUHStu04HHZi+/QqLxs8MI4Nm5N8bBM3JZEQH1f+PCE+jsRNvnZHk6TtO3HY7cT26lleNnhAfzYk+e6cnpi0lYS4il5swoD+JFazSzhzAkF9hjDhmpu5dvKf6BgZUW+NZqizYWiSoc7sYjcvrs4mMb2UnuF+9Az3w+VRPtuSz4OjIwgPsB/03A1ZaXhUiWvbgV6h7Qiw25t1gIvL4+HjHRs4M6Y3p8f05C/LvmNdZiqDIzpRsP55cpfdiV+HYQQPvANbYAdcuVvxFO5rDCnLgCnelc2TsHpdJ2L11hqSsUDioQxUdbeIrAROwhoabTDy8wsICw2tUhYeFkZefn6NtuFhYVXs8vMLUFXyC6rWAYSH19xOQ5JfUEBYaEjV162r/vDQcv1NEfZuaa/+2YfWrL2g+mcfSn5B02m3NBXW8NmHkJdfUKNteKX3Gh4WQn5BYRX9axbMpbi4hM+++Z5S52FtHlAjJril4WgSx/fuhlwS00s5tWsQ1w0OL79Q6rIKTZmT+9/mVXyxOwlVWJeZ2qh668OPKTtILSpgUs8BnBrdg78s+47Z29YxqH1H8lf/ExDanz0fW2AkruxN2IOjQQ7u+OtJWa/vZ2AjUH0c9y4RmV7p+VxVvbLS8/RKP0qPqeozlU8WkTOxepIj66AlBahxYlZEpmHt7E63rl3q0FQFISHB5OZVXZ8xNy+P0JCQWm1z8/IICQlGRAgJrqGd3JrbaUis163qKOqsPze/XH9TUONnlpdfs/bg6p99PiHBTafd0tSmhs++gNCQ4Fptc/MKCAlu46M/MDCAyReeT9yp5zMkvj8Jcf2r
N1V3xMzxNRRNMtS5O8/aNb5vO/8qF4qI1Hrhlw1rJuVksGj/bko9brblZpLvrFdaR5053CjP2VvXA3DtL/Po9cF/AfhgWyLqceMpyQR7AOLfDoDMBZNI+/Q4MuaPbVjRFczCCmC5ipqHOZ9R1baVjiur1UdWqqvu9EYB7wN/UtW6jFnFAJk1Vajqa6o6TFWHdTjM4aHYPr1xuVxs2bqtvGzt+vXEDejnYxvfvz9r1lV0TtesSyS+v/XDFD+gP2vXJ1a5GVubmOgTYNLQlOvfVll/InH966B//fpy/U1BbO9euFxutmzbXl62dv0G4vrF+tjG949lTeKG8udrEjcQ39/X7mgS26sHLrebLTt2lpet3biJuFjfXejjY/uwptKw95oNm4ivwa4Mp9PF9l176q2x7DfyUMexhog8XelxnVdnORSN5vic6b9TsPFVCje/RfGu+RTv/pLiXfNwZm3g9G5WTMVXOwrYlev8f/bOOzyq4mvA72wv6SFAQkIPvYMICAKiIk0UCyogCiqfvf3svaBiV4q9g12kCoIgiPTeQocEUijpyfYy3x93UzeVJNL2fZ592J177uzZS3bPnTOn4JUSt1eyJ8OBw74fQ2sAACAASURBVFPxqm9H5km0KhW96sfSq34szYLDkMDurLNv1efwuJmduIeLomL49fIb+PXyG5jYuiuJedlsSD+OofmN4LGTv/VVpNdF/Rt2ow5uUmf6SCmTUIJchgKza2teIURXYB4wQUq5rArycUB3athTqyzMZjOjrh7OC5PfwGKxsHrteuYuXMS4m0f7yY67ZTTvTZtBSmoqqWlpvDt1OuPHKvE/A/pdglqt5sOPPsXhcDDtk88AuKx/v9pW2V//EcN4YfIURf9165n7xyLG3XSjv/433ch70z8iJTWN1LTjvDvtI8bfclPhcafTid1uR0qJy+XGbrfjrcNuEWaziVHDh/DCG+9gsVhZvX4jcxctYdzo6/x1H3097834rEj36Z8y/uaiz1hCd7erznUHMJtMjLrqcl54eyoWq5XVG7cwd8lyxo262l//60by3mffkJJ2gtTjJ3n3s68Yf8O1AKzbso1/N2zG6XRis9mZMuMzTqSnc3HX0rFi1UNwwbYluqvY8zm1MWGduDq9Lgu5G57Ck3uY4J6vI9RGnCfXYU+cjbZed67s9ynBOhXLj1qZvDYDm1uiFtAoWMszvSKgnMgkKSXbM47TrV40a6+ZCMD8pH1c/eeP7Mg4wcX1q+cWqy6bT6UW/ts9KqZS+aS8HEY1acnVERoGy11Ij4024R7cjSJIy0mhZ58PUWnN5G+fQv62NxC6YKQzB13MZaTlu0nIcCKRNArSoFUJVAK8EmKCNJi0p33PMhEIl1JahBA1/v8XQnQAFgP3SynnVyJrAi4C3gM2AH/U9P3LYvq7bzHxnvtp0LwNkRHhzHjvbdq3bcOq1WsZet1o8o4fBWDShNs4fCSJTr0UYzbx1nFMmnAbADqdjt9/+I4773uIp154mbatW/H7D9/VeSoDwPR33mTifQ/SoGU7Rf9331L0X7OWodffRF5qkk//8RxOTKJT70t9+o9l0oSiRfrga29g5b9rAFizfiOTHnyE5QvmMKBfmbFHtaP7W5OZeP//aNCmM5Hh4cx4+zXat2nNqrXrGTp6HHlHFWfApNvGcjgpiU79Lld0H3czk24bW6T79bewcvU6RfcNm5j08BMsn/szA/r2qTPdAaZPfp6J/3uWBl36EhkexozJL9C+dTyr1m9i6K2TyNu3WdF/7GgOHz1GpytGKvrffB2Txio3Vw6HkwdfeI3DR4+h1Wjp2CaeBV9/TEzD+jVTTtRe1KYQIgL4ArgSSAeeklJ+X4bcQOB5oBuQJaVsWup4ItAA8PiG1kgpr6wVJYvYLoT4FUgA9OXl7Ekpn6/qhOJ0uztIr0t6rWlIj5Mm7foDcPTgVpASlSECodIipSS+eQx6DezYsQOV1gxqY4nleKNGjUClJvlo
kt8yvXTlljRrHjEz32VC6y580V/5gzucm0WLHz/kvvYXMfWSoQDY3W5aNGlS4lwAp8fDn8kH+b8+ymp5ztYN9IiKQQiB2+tl0bED7Mw8SYwpmGynnTEtOxJlNJNmzeOhNX+yJPkQd7Tpyud7t3JlbAve7zOYhsYgdmedYm7iXhqYgkiz5tE4KJThjVsRYj1Exry+aMLbEz7oR6SUONNW4M7ei7Zed4zNlX5bTRrH0CDEy8JlfxIZ0RKV1kxqvpv1abkkZKXTOERLltNCnDmUduH1iQvWVmb4SlxI3x/nHVLKv0qNa1Dy68rL47NLKesVy+PT+rofF5/jK5R9PWux4SQpZXvf8RUU5fEBHESpt/eOlNJe0YcA6NGti9z4z/LKxM5eAv34zizWc7cf30VDr2fTjl2F3+WuwSa5ont8RacAELZyx2YpZY+KZIQQP6B4/Cai5PIuBPpIKXeXkusJtAaMwNPlGD6/35baRAhRH2XV1wTlt6aszgyyOikNp234rJZMuTd5I8lWCxpTA1ReF15HFnEGNY0bdOaEjCTZakMjvITqtFjcToK0OmJMQRit+8g+uQWHNhy9uSEOew4adw4hWj2i/gCScx3YPZJIsw6tkHi9HuWXXK1mT3YaTYOCGdBJiZ84mniQ5alHiDIGY9TomJ+4l+N2K50iG7I76xRhWi1DYlsQrjPz06EENEJN16honB4nCdknqWcwcWVsPPMSD5PpcNAhIgqTRsVJm4U0ay7DmrRid2YmR3Nt9G7QmHpGDSDZcDIJk0ZH92AP+46uoqEhmNCwJgiPjeP5p7A5rXRpOQRV5nqcJ9aiMjVEpQtDui240reijR3CLvOl/JVyBI1K0DEiioSskxg1Gi5t2IwGeg+bjx8i2hTM4CuvAyH4acEsjCoVbSNbsCMjg6S8PIINOkxqNTa3g/oGI/Hh9YkwB503/o6A4TvDBAzfGcPP8IWY5Moe/nu9pQn9e1uFhk8IYQaygA4F+/FCiO+AFCnlk+Wccznw+ZkwfKXe7ysp5e01nqeG/fwCBKhTAobvDBMwfGcMf8Nnlisvrjx4KfSvLUko7ssCPpVSflrwwrcnv1pKaSo29j+gv5RyRFlzVmL4jCirx63AY1LK7ZV/utNHCBEOjEAJkEsBFkgpywyUK49zsmRZgAABAlxoCAGiasEr6ZW4OoOA3FJjOUBwGbKVMQbYgrK98iDwpxCijZQy+zTmqhQhRG8Ut+xeIAkYDrwvhBgmpVxb1XkChi9AgAABzhGEulYC8fOBkFJjIUC1l8dSytXFXr4uhBiPUpiiwkC3GvA+cI+UsrA2nxBiNPAhSuBclTgna3UGCBAgwAWHEKBWVf6onP2ARghRPFKmM5VUXKoiklLBdbVMK6B04dpfgfKTKMsgsOILcFbjObif3KsHnWk1ThtjixqGsJ9h7EnplQudxXy/JulMq3DanLL5ewtrI0Hdl8o0G3hZCHEHSlTnSMAvV0QIoULphKBVXgoD4JVSOoUQjVE6JGxEWUTdD9QDVpeepxY5gFJusXjqxQ3AobLFyyZg+AIECBDgXEAp1llbs90DfAmcBDKAu6WUu4UQ/YBFUsqCOnOXAn8XO8+GUvJwAMqe4EdAC5SWZNuAIVLKjNpSsgweAhYIIR5A2eNrCsSj7PVVmYDhCxAgQIBzBFFLds8XBXlNGeOrUIJfCl6voBzXpS/nr1PtaFQ1pJRrhBAtUDq7xKDsJf7xX0Z1XnB5EFJ6cZ1Yi3Tb0NbviUpXen/4rOG8yeMLECCADyFqK7jlnMbX/mxmTeYIrPiqiMeaRvbKCThTlDxNoQ0h9JKpGFvecoY1CxAgwAVDwPDVCoGrWAWklGT/Pa7Q6AFIVy62Qz9WcFaAAAEC1B5Ko9kLrztDXRBY8VUB14k1ONNWAqCPG4quYV/sibXW3CBAgAABqoAIrPhqiYDhqwKOFKXTjj5uGOFX/o4QAnPHR3AcrasczQA1QQSHYHzsBTTdeyNzs7F/
9iGu5Yv95NRdemAYdxfq+DbI/DzybhlW4njw9wsR4RGFZcfcu7djffyeuv8ApmA0tz2MaNcd8nPwzP4K74a//cRE686oh49BNG4J1jxcT5Vsn6h59E1Eoyag0SLTT+CZ+y1ye5WLW5wewSEYH3keTfdeyJxs7F9Ow/13Gde+cw/0Y+5Urn1eLvm3lqyUFfTt/BLX3pOwA+tT99at7oA+PIwBMz4g9rIB2DMyWf/Cqxz85Tc/uc4P3kfrMaMJiovDnpHB7s++YvsH0wqPj/hjDhHt2qDW6clNSmLTq1NIXLioZsqJWktgP2cRQjwIzJJS1ijPJmD4qoArYysAxla3FroShEqNoWnJoKh8p5cTVjc6lSAmSINaJSpsXFu8c0R5uL1erG4XaiEwa+u+JU5F+OrymYBmUkqLb+wOYKyUcoAQQqJ0aZAoJZB+Qqnd5/HJrgBmSik/r0s9DQ8+hXS5yL1uEOqWrTG/9iGew/vxJh4uKWi34Vw8F5YvxjBmYplzWZ55CM+W9XWprh/qMfci3W7cj45GxLVAc/8ryOTDyNRSOWkOO97Vf8KGv1EPvclvHs+PHyHTksDrRTRrjeaRN3A9OxFyqhUAVy2M9z0BLhd5N16BukVrTK9+gOXwfrxJJa+9tNtw/TkX14rF6G8qu6i+9fmH8WzdUGe6lkXfd9/E43TxTfN21OvUgSG//kDGrl1k7dlXQk4IwfI77yVj125Cmjdj+NxfyE9J4dCvvwOw+rGnydq7D+nxUL9HN4bPn82PXS7GeuJEjfQLuDK5DJjs+y35Dpgjpax2QdkL+/ahirgzdwKga9C7zOOnrG7e2ZjJ3UtP8Ny/GTzxTzp3Lz3B4WxXmfLV4ZM9mwj9+g2afP8+nrOj4LEapSZfeXT25QD1B0YDVW4VUisYDGj7DcLx1Qyw2/Ds2oZr7Uq0V/in+Xj27sa1dCHetMpvQP4zdHpU3frimfsNOOzIg7vxbl+Lqpd/Er9M3Id33TJk+vEyp5IpR4qKZEtArUGER9Wd7gYDmr6DcHzzkXLtd/uu/aBhfqLefbtxLfvjrLr2GpOJ5iOHs/GV13FbLBxfu56kPxbTqowmwNven0r69h1Ij4ecAwdJXLiIhr16Fh7P3J2A9Pha1EmJSqshKLbyHp4VUnuVW85ZpJQjUdoTLULJ6TsuhPhcCHFpdeYJrPiqgNeh3CGr9JEA5G58DrwOEGoM3V/j1XWZnLJ6aBaqZWBjI2oh2HbSTo7DU7iqK91bsKosPqYUJMhw2NicnkbP+uWvIP8j3gIeF0LMqKgQrZTyoBBiNUpViP8MVWwT8LjxJh8tHPMc2o+mc/fTms/09GRQqfAc3Iv94/fxHt5fW6qWiWgQCx4PnCj6O5HJR1C16nha82nufxnRtitCq8O7axMyqe70VzVqAh4P3pSia+89fAB1p26nNZ/xyVdBqPAe2of9s/fxHj5QW6qWSWjLFnjdbnIOFhUBydi5m+gqNL+N7tOLhC+/LTE25JfvaTTwUjQGA0eXLuPklm011vFCd3UC+BLkpwPThRCdUFZ+twshjgGfAR9IKfMrmiNg+KqA9LpAqEGlBcCaMB3pygOVll2NnueU1UOEQcXzfSLR+zokD2xswluFlk+urARs+7/BnXsYPFZU5liMzW5AH3s5Do+bv1OPMDCmKX+nJrL42MGzwfBtAlYA/wOeLU9ICNEGpVjtm/+NWr73NZqQVkuJMWnJRxjN1Z7LOvlpPAf2ghDoR92C+c3p5I2/FiwVfqdqhsEIdmvJMZtFGT8N3FOfB7VaMX7RjaEO25AJoxFpLXltlGtvKueM8rG98Syeg8q11117M6bXppE/8bo6vfbaIDOuvJJ1mp25ueiCg8o5Q6HHM0+ASsXe70o2MF90wy2oNBoaDexPeOtWNb/2Aqhad4bzHiHEIGAsSqm1TSi/M0dRvFGLUH57yiVw+1AFhMYE0oN0+35Qhbrw2K50xb3cub6+0OgVoKrEH2898B3p
v3XGfnQh+uh+GFuOQW2Ow3lK2VNaffwYFreL8a06Ex8awZ/J1SpHV5c8D9wvhCjLb7ZFCGEB9qAYyBnVnVwIcZcQYpMQYlOG0135CcWQNivCVNLICVMQ0mYp54zy8ezeDk4HOOw4fvgSmZ+H5jRXL1XGbgNDKUNhMCnjp4vHg9y1CVW77ojOvWqmXwVImw1hKmkkhMmMtFnLOaN8PAlF197541dISz6ajl1rS9UyceVb0AaX7MyjDQ7GmVe+sW0/aSKtbr6RRdfdjNfp9Dvudbs5tnQZsYMG0GToVTXUUElgr+xxPiOEeFsIkYzSjWEv0FFKeaWUcpav6szNQKV/KOf3Vaol1EFNAPDkK8EFDW89hSZSubYe302cppqbzlJK8jY+B0Dk0CWYOzyALuYyTG3uwNT6DgD+TD4IwGUxzbgsphnrTiaT5ajBD2AtIaXcBSwAyurW3A2l5NFo4GKg2kstKeWnUsoeUsoekbrqOSW8yUmg1qBq1LhwTN2iFZ7SgS2ng5TKPksdIk8kg1oN9Yv2g0Rcc//AltNBpUZERdd8nnLwpiSBWo0qJq7oLZvH+wcVnQ6yrov+Q87BQ6g0GkJbNC8ci+zYnqw9e8uUbz3uFro+8iDzh4/CkppW4dwqjYbQZk1rpqAAVKrKH+c3BuBaKWV7KeUUKWWJvSMppQuoqBchEDB8fjRq1MgvElMb2RkAR/ISP/mWYYr7c0+ms0quzUKkF68tDdRGVKaGAKTPuZiTPzQhfbayLbb42CGMag2f791CsiUXr5T8lVILPyK1wwvAnShdkEsgFX4G1qKsDv877HZcq5ajv/1uMBhQt++Mtk9/XEsX+MsKAVodQqMpfI5GMbSifkPU7Tsrr7U6dKNvRYSG49lV832aCnE68G5ZjXrkraDTI1q0Q9W5N951y8rWX6NVDGXhc9+NQsM4RIceymdSq1FdfBmiVQfk/p11p7vdjnv1cvTj/0+59u06o+0zANeyhWXrXt61j2qIul2xa3/DOERImLICr0PcVitH5i2kx7NPojGZaNirJ02HDWH/j6W74ED8jddz8YvPsODq68hLLHlTEtaqJXFXDEJtMKDSaIgffQPRl/Qm9d81NVdSyWKv+HF+8zpwsPiAECJcCFF4pyilLPtOpRiBPb4qoK3XHdv+r7Hs/ABD3FDUoa3Aq7g1escY+WlfHsl5bqZvzeaKpmbUAraedNAhUke7evoy51TSIa7DfuQXLDvextzxURrcksTJn9sindmkWvLYkamEPr+85Z/C8/48dogbmrcn3+nlaK6LTLsHryxw/QuahmqIDdbW9SUpCF75CXgAKO/X9A1gnRDiDSllQeihxtfapACP7y6t1rB/8BrGx14k5LflyNxsbO+/hjfxMOqOXTG/MY3cYZcAoO7UjaD3ijIrQv9cj3vbJiyP3IkwmTA+9DSqmDik04H30D6sT96HzM2pTVXLxDNrGprbHkH77s+Qn4tn1lRkahIivgOaB17Fdb+SRiPiO6J97K3C83QfLcC7bzvutx9XPt+IcYhJjcHrRZ5Mwf3pa8ijB8t8z9rCNvUNjI++QPDPfyFzc7B9+DrepMOoO3TBNHkqeSOVrRd1x26Y3/608LyQhWtxb9+E9bFJCJMJwwNPoYqJ9V37/VifuR+ZV/fXftXDjzFgxoeMP7IHe2YWqx56jKw9+2jYpxfDZv/IFw2bAnDR80+hj4jgupVLC8/d/9OvrHrwfyAEPZ5+nPA2rZWoz0OHWTr+DtK376ihdkK5ybmwmYMSKZ5VbCwW+BzFw1QlhDz9Ddfzskh1WdGXHtspTv7YDDwOUGlR6ULx2tNBpSV6gpX9WU4+3pbNcYun8BwBPNs7graR+nLn9dozyF3/OLaD3xeb9yS66P7Mb/Emt6+cyyf9hnNlbAsARv75IxkOKyuG3stLazIxagQv961HA5MaIQReKXF6JAZN3eyA+/L47pBS/uV7HYfSH2tdsTy+eCnlwWLnLAISpJSP
+nJv+peadpaUcmx579k1xCRX9Ghdy5/kvyPQj+/Mci7343vdlk2Sx134Xe7eMFyuH1t5b0rtO79tllJW6u47FxFC5Eop/boDCCFypJShVZ3nnFvxHbfm88fRAxzJyyLWHEKKNY+rYltycf1GqE/Tv21xeVmSaCHD5uWPrYcI1qk4kqMsQowaQUNzFMHdXyFvw+PgdSlGD9CEtgKgWaia5lGJaPVeMu3Qt2Esw5tFE2lU7s4yiwUmHMnNollIOAAqQyRh/b+gy7hFNAz1sOqfP1Gb41DpgsnYsYYrGjXnumZtifQFO9zZphvzkvZh0jl4uW89Dme72HjcTqhehQrwAjFmDS3D6ybRXUrZtNTrYyg+94LXfgZXSjmk2PMBdaJYgAAXBOJC2MOrjJNCiJalbq5bovQUrDJnneGz2fM4mH6EE3YbQfoQPNKD2+Mk2qAjJjiaE+lHuFjr4LlrxgGwa/tyUvOTOZXlIMtj5N+0Y2S7PTQLDiPVmk2YTk/XiAa0i4pBqFR4pBeBQCIRCFRCoMNFhPoYKU4rK5L1RBg0ZDssRBv0dKkXjcMQhrXlnXh1DVEnvIdXelE1GkxQ50c5kpvFMxuXo1druLZpG1KsufyYtJJTrmbcFt+ZuYcOsys7ndeWLiHSoOenA7sJ0xkYHNsCnVbD3uwMXB7BsUwNC7NVhNsyaRumZkxca3oao0lIzaZhsAuNcDOsfhxDI6MJ9drg5HzMtkxkaFuEVyDt6Zi8+RhMPYHTy/kKECDAWUxBcMuFzZfAb0KIZ4DDKE1wX0FxdVaZmrg6AwSocwKuzjNLwNV55vBzdUZHyvUTKk+J0L72/fns6lQBjwITgTjgGIrRe1dKWeXSVmfdii9AgAABApTD+R+1WSE+4/aW73HaBAxfgAABApwLBFydAAghWgOdUfKFC5FSflnVOQKGL0CAAAHOAQQCcYGnMwghnkbJDd6O0gmmAImy/1clAoYvwFmNOqoeQffedqbVOG28S/170Z1LeO21mmL5n3NDxwZnWoXT5uOdpcrsCWrN1SmEiAC+AK4E0oGnpJTflyE3EMXQdAOySkd2CyGaAl+h5NAdBe4rSHeqIx4Cekopa5QUGVg3BwgQIMA5gS+BvbJH1ZgOOIEGwBjgIyFE+zLkLCgrqcfKmecHYCsQCTwD/FpODd/awoZSo7NGBAxfgAABApwL1FKtTiGEGbgOeE5KmS+l/BeYB4wrLSul3CCl/A4ldaD0PK1QVoIvSCltUsrfUKo4XVejz1kxzwFThRDRQghV8Ud1JqmJq/OczYPwWNNw5xxAExqP2lSyaK+UXjz5x1DpQlDpw2vtPaXXhSf3EEIfidpYlzdEQF1X8w0QIMCZoXZcna0At5SyeHPG7fhXVaqM9sBhKWXxXk7bfeN1xde+f+8oNibwtVqu6iQX1B6flBLrno/IXf8keGyAwNRmIiF9piJUGtw5B8leeTuuk+sAgb7J1YT2nYHaWLVcLCkl0m1BaMyIYn+groxtZK+4DXfWbgD0TUYS1u9jVIZ6dfApAwQIcH5S5Vqd9YQQm4q9/lRK+Wmx10FAbqlzcoBgqkeQ77zS89Rl09BmtTHJBWX47Ed+I3fNg8VGJNa9nxPc4xWk0JCx8DK81rTCY46kuTjihmBqM7HCeaWUytzrHsVrTUVljiWo4yOY2t+H13acjIVXIJ1FzcodSXNxtBiNsfkNtf8hAwQIcH4iqmz40itJYM8HSte7DAHyypCtiNqap8pIKZOgMJG9gZSy4n5Q5XBeGj4pJZ6cfUi3BXVoG1RaM1JK8re+CoDQhWFqdRvunH04ji0CwLrnk0Kjp4sZhFAbcCT/WaX3sx2cSc7KCYWvvZZkctc9gqntJCw7Pyg0evom1wBeHEfLaNMSoNbIzLNw50c/snT7PuoFm5k8Zjg39+vuJyel5KmZ8/ly2ToAJgzqxetjRyCEYFXCIYa/9kkJeYvdyc//u51RvTrX7QcwBaG+9WFE226Qn4Nn
ztfIjSv8xESrTqiG3YJo3BKs+bifua3EcfXDbyBimirtijKO45n/HXL7ujpVXQSHYHziJbQX9UHmZGH79ENcf/3hJ6fpehH68ZPQtGqLzMsld/SQEsdDflqEiIgEj1KMw717G5ZH/69OdQcQIaEEPf0y2p598GZnY/34fZxL/L+vmm49MU24G3VrRf/sUVf6yRhuHIth9DhU4RF4TqSR9/j9eI/VsJJM7bg696N0SYmXUh7wjXUGdldznt1AcyFEcDF3Z2fALzq0thBChKE0t74ecAFmIcTVKJGez1Z1nvPO8LmyEshdfR/O46uUAbWR4K5PY2w5RnE1qrREjvgHbXhbAKUzAirsSfMACOryFEHdX0IIgfPURryW1ArfT3pdhQ1l1SEtFYOanYDt0I8A2I8qfeCCL36LoI4PAeA8uQ7pLL+r89mEryNDDBAjpUwvNr4V6AI0k1Im+sZeROnT10tKub6Y7G6gSamptYBWSlnrAVb3f/4rOo2a1M9fYVtiCle//imdmsbQPq7kfu5nS9cwb+NOtrzzOAK46pWPaFY/kkmDL6FfuxbkzHyzUHbFrgNc88bnDO7SprbV9UN9873gduF+/GZEbAvU972EO/kwpB0tKei0412zBDauRD1ktN88np8/Vs7xehFNW6N+6DXcz98BuVl+srWF8eFnwO0i55oBqFu2IWjKNPIO7sObeKiEnLTbcP4xB9eyRRjG3lHmXJYn78e9eX2Zx+oK86PPIl0uMof1RxPfhuB3ZuA5sBfPkZL6Y7dhXzAbsdSAcfydfvPoR1yHfsQo8h69B0/iIVSN4pB5pb2L1aSWEtillBYhxGzgZSHEHSjf45FAH7+3VFZWOpTvq/C1FPNKKZ1Syv1CiG3AC0KIZ4EhQCfqNrjlY5SWRE2ABN/YWuAdoMqG77yK6vS68slcPEwxeiotKnMceGzYjy3GlaE0sdRF9y80egDGlrcgdCG4MncAAnOHBwr353RRF2FoOrLC93RlbMdrTUFliKLeyLUEdXmCsAHfEDFkMV63BU/OflAbMLe9q/AcXf1e6GMvr/TzSK8H26GfyFg0lFOzu5O5ZJTPUP/nHAFuLnghhOgImIoLCOWi3Qpk+v4txNctOajgATREiRJ7pbYVtdgdzF6/g5duGkqQUU/fts0Z0aMDM1du8pP9dsVGHh4xkNjIMBpFhvHwiIF8s2JDmfN+t3Ij1/XujNlQdn/FWkOnR3S9BM+878BhRx7ajdy+DtXF/u1oZOJ+5PrlkF6OtyclEbwF5QslqDWIiDoMrDIY0fa/HPvn08Fmw7NzK67VK9ANHu4n6tmzC9eSBXhTk+tOn+piMKIbeAXWT6eCzYp7xxZcq/5Gf9XVfqLuhJ04F8/Hm3rMfx4hME68G8sHU/D4DL435Vgt9HKs1XSGewAjcBIlJeFuKeVuIUQ/IUTxu/JLUVII/gAa+54X78h9E0rH8yyU/pvXSylP1ehjVswg4AGfi1MC+N6vWkVxz6sVn/3QD3gtyWjC2hI+eD6a4Ca48xKxH/oJV6bP8DXs53eeQr72YgAAIABJREFUO2c/eOxoIjpVO+DEdWojAIam16LShxWO62MG4jyxFpDoonoiNKZyZgCPJRXn8X9BetCEtUYT2QUhVOSsuhPbge8AENog3Jk7caQsxdjylmrpWAt8h2LMpvpejwe+BV4tJtMPiEaJtvpQCPGwlNJZznyfoxSXfam2Fd2fegqNSkWrmKLvQacmMfyTcMhPNiH5OJ2axJSQSzh23E/OYnfw29rtzHmy7JVJrdIgFrweOFnUt1GmHEHEn17HDfU9LyLadkVodXh3b0ImHaj8pNNEHdcEPG68yUXuPM+h/Wg6n169ZNNzb4BK4DmwF9uMd/Ee2l/5STVA3dinfzF3pPvgPrRdL6rWPKr6DVE3iEbTPJ6gZyeDx4Nj0TxsX8yAmjYFqKUEdillJnBNGeOrKFYKTEq5ggqixH3engG1olTVyAHqAYV3e0KI
xsVfV4XzyvA5UlcAYO78BJpgxbOmCW5KUJcnyFmjuBkLUgmcx1fjzvatlNUm3z8NAfBYTxS6StXmRuga9C73Pd2ZSvNxbZT/HpLXrtz4qIxK9QhP/jHFwAHqkOZoI7uS8+8kZRVXrLC4sfUETK0nYjvwHUIfScTgOWijLkY6MrDurVb3jdpiHTBOCNEWZX/gJuASShq+8cB84GfgQ2AE8FvpiYQQD6C4VLpVp5p6Vcm3OwgxGUqMhZqM5NnsZcqGmo1FcmYj+XYHUsoSUbm/r99BvRAz/du3rG11/RB6A9isJcakzYIwGMs5o2I8M14ElVoxfg3jav7DWxFGE9JSstqIzM9HmMq/6SsPyytP4dm/B4RAf/0Ygt7+mLxxI5H5dRY3gagl/VX1le+7tmcfcsZeiwgKJuSDz/CePIFj3q81UDDQgR3lprmgLZFKCNEbeA3FBVplzivD50rfDICuYV8AvPZMlPasID0OAIRGuZmxHf4Za8IMAMydlKIEQqtE87qzdpO9XPHsGZrdUKHh87qUL6JKHwlA7rrHsCfOUc6NH+ub11yoX/YKxQtobD0BdVBTbAdmoo3qSUivt1GbY3FlbMOTfxR70nwAgjo+hK5+L2UeQz2Cujx5mlenxhSs+lYCe4DCJYkQwgTcANwqpXQJIX71yZYwfEKIXih/pJcX3y8sjRDiLuAugMb1qpdLGWTQk2staeRybXaCjYZKZXOtdoIM+hJGDxSX6Nj+F/mN1wXSYQdjyR9aYTBBsWbG1cbrQe7ehOqykYhTqcgddbRvZrMizOYSQ8JsRlqt5ZxQPp5d2wqfO2Z9ge6qq1F36oZ7zcoaq1kespb0lw7lb8o260tkfh4yPw/7nJ/R9elXM8MHoLrgDd8UFHfrdJR9xy+BT4APqjPJeWX4pEtxTRe4K0/N7obXqvw+m9reDYDXVcYGs0pb4vzyeH3rKjanp9G9XjRPdVVcpgUFA6T0KPM70vHkJxZoVOG8tgPfAhA28Fs0IS0AUAfFAZC5VNkf1kR2BcB6YCbu7D3KZ2lzJ5rgphXqWgd8B/yDkkfzbalj1wJulH0AgFnAX0KIqAJ/vxCiHvALSk3ACkMLfTlHnwL0aNG4WkuUVjFRuL1eDqSdIj5aWd3vSEyhXVxDP9l2sQ3ZkZhKz/gm5codS89i5e6DfDTpxuqocfqcSFZ+3OrHwEklsErENkOm1kJfOZUaERVdZ5UnPMeSQK1BFdsYb7ISiKNu0RpP4sFKzqwKss5b8niO+uuviW+N53D19PckJSKdzpKr69q46CLQgV0qDWQ/oJqGrjTn1VUs2EcrSB/QhDQDtXKnX+Bu9FqVPZygzk8UrvRUulAAPDblmDaqO+GXl7wz25V5kmc3/c3B3Eye3fQ3uzNPKu/pW+kVuDUNzW9CHzvYN6+y5+f1zauL7k/YwJmKjtKLJ/cgQhtcaPTS5w8gff4AMv8cSYFbvWCVYU/8Hcv2N7FsfxOv5b8PCPDlzxwBhgKzSx0ej7IvcFQIcRzFwGmBW6AwMux7YLWUcip1iNmg59qenXjxxz+w2B2s3nuYeZt2Mba//z7TuP4X8f6Cv0nJyCY1M4f35q9g/ICeJWRmrtxE79ZNadHwPyo24HQgt65BPWKcEujSoh2ic2+865f5ywqhpCqoffevxZ83iEW07wFanWLweg5ExHfAu39n3elut+H65y8ME+4FgxF1hy5o+w7A+eeCsnXX6RSdC58ruov6DVF36KK81unQ33QbIjQMz86tdae7T3/niqWY7rwfDEY0nbqi7XcZjsXzytdfrQGKfRYAhx3nskUYx04AkwlVVAMM11yPc3UtrFZV6sof5zFCiMvKe1RnnvPK8GkilPyqApdn5PC/0dW/GABtRCcAnGnKH5/aHFNYrkxliga1EXfmTrz2TFS6ULT1upaY+6kNy4gymFg27FaiDCae3KD8EBXIOdOUPUFD3GC0DZSoYHVwUxAqnKc2It1WVPpwNOEF1XwE
QhuMdOUj3YorRRveHnfmTlynNqANb+f7LMqXPaz/l5ja1n0eUyVMBC6TUhbfCGmEEmk1HCUsugtKLs8UiqI7X0TplvwfRIfAtDuvx+Z0ET3xOca+/y3T77yB9nHRrEo4ROjYxwvl7rqyD8N6dKDLo2/S+ZEpDOnejruuLBnRPXPlRm4tZQzrGs8P00CrQ/PWj6gnPoHn+2mQdhTRsj2a94vuOUR8B7TT5qG5/xVEZAO00+ahfnCy76BANXwMmrd+QPP2j6guuwbP52/AMf8gn9rE9u5khF5P6NwVmF+YgvXdyXgTD6Hu1I3QxUULfU3n7oT9tYmgt2agahijPH9HyZsUJjOmR58ldOFqQn77C83Fl2B57J5aiIqsHMvbryL0eiL++Iegl97C8tYreI4cQtO5GxHLNhbp36UHkSu3EvLeJ6ijY5TnH3xabJ7JSKuViHkrCPlsFo4lf+CYX/p+sZqIWo3qPFf5otRjHrAYZe+vypyVrk6vMxd3zj6kIwvpsYNQK8Yo6iKEWo+Ukm0Zx3F7vTQJDqO+UfHL66MvxZE0h/ztb6GL7l+4kgPQFBi+E6txnlxXuG8GirtSG9ER16kNWBKmE9ztuRL6/JOWxIKj+wnTGRi08FscHg8Lju7nn7Qk+kQpEV/2pHm4shIKDRaAUOtRh7TCk7MXS8InBHV6uNh7CgxNR2I7MJP8bVMI6v4ioX2n40hdjnRmo28ygvxtr2PZ9QG6mIFooy4qdMmeKaSUZf1q9gO2SSmLhzgjhPgQeFQI0QElv8YFHC9jn6ydlPJo6cGaEBFsZvYT/ja2dG6eEIIp465myjj/cPUCdn/4dG2qVjWs+Xg+9s/0kAd3435oVNHr/Ttx/d8QPzkAjh/DM+Xhso/VITIvF8szD/mNe3ZsIeeqou+ce9smsi/tVOYc3sRD5N1+fZ3pWBEyN4e8Jx/wG3dv30LmoKLoTvfWjWT0Lr8kpbRayH++vIYGNSDQgb1EyTIhhBrl96VaUU9nneGzHf6F7L9vxdD8ekJ6TsFrO4Hz5FrsGdsxCTWZwR2ZuHIeR/KyGBzbgm8P7GD6JUO5qWUHjC1vIW/rK7hOruPkT63QhDQvXDGpTQ3Q1r8Y18n1ZCy8HEPcMFzpRbldhqYjcZ3aQP6Wl5U0BE9R0MPne7dwUVQMc68cg5Rq8l1ublo+i48SNtPvsmvRRHbBnbGN9Dm90McMxHWq+LxXY9m+l7wNj+NMXY7XWZQ8HNTlGRzHlpC/7TVsh39GZayPJ+8IKn04uqiLMMaPxXZgJhnzLkFozEh30ULLYz2OJ+8I0mNHqPWAAOlFE96udotrl+q/VWzcTVGY8xtlHE9FcXfCeeZZCBDgjBCI6vRDSukRQkwGkoF3q3reWWf4DE2vJfyqhew+spLUDZ/gQEe7ID2NQ1vhUgfz4/b5XGlIp2dsQ4TGy+BubViVvJnWejdd47oQccVssv+5E0/uQVynMkGoMba6HVRagru9SObiIeBxYE8s5jLSGDG1uQvLrql4bcdxpiwtdkzPkPqDMLttzN7voFe0EZNWw7f9x5Pr8CKEILT3e2Qsugo8NhzHfPEdKh1qY33M7R/AuvczpCMLR3JRU1KhNqAJbUnkqK3s3ToNbfpatEJLvU6PYmyuBFKE9vsMR/0B2A98g8qZS1B4K4wtRiOMDchZdReodGha/x8eWzqerF2QdwBjk+GYm9dl4YQAAQKcMc7zPbzT5AoKwveryFln+I7m23nniCTT0YMhcS1xeFxMOX6Edpr63KFrwATVTLzqdAyhLUGocWcmcLF7A9qsTKzR7fnDGsHSBpPpH53K0bwMmjTuz5D4XgiVGn3s5YRf/gvZqyYhHZmoDFEEdX0GfWOlPmPk8OVk/z1O2SNUaTG2uImQXu8Sd0pL0xA3UkKG3UOuU1noeH2RWtoGl+AavAKx7WWwJqOP7Ex450fRRnQAlL3G7OVjcWftArUeU/x4gnpMZlnKYd7avoYwXSeGtr2O
2Uf2ILIET3vq0c7l5O0da5iRcIrrmz3K2hPJGPO1vGXoRlt9DPMbfUdSnp02NkGEQYWs35vIODP1jFrM5V3cAAECnMMIqF7bufMOIcQxSsbImgADSiWaqs8j6zKhNUCAGtKjRWO5/s1Hz7Qap4136eLKhc5iLAkplQudxXhcnjOtwmkzaOdhtuXbCjf1erSIk+vfqHzfVnPjo5sr6c5wziKEKN0z0ALsl1JWqxDqWbfiCxAgQIAAZRDY40NKWSsVDAKGL0CAAAHOFQKuzu+oQjkAKeWtFR0PGL4AZzc6PTRudaa1OG1UN0aeaRVqRNCWuu3fV9fsm3Hu9r50uUrHa4hAcAtkU1QXOAmlY8QI4Bsgo6qTBAxfgAABApwLCEB1YefxAa2AYb4uEgAIIfoCz0kpB1d1koDhCxAgQIBzhcCKrxdKt5jirAfK7yRQBhe2wzhAgAABzhUCJcsAtgKvCSGMAL5/JwPbKjyrFDVZ8Z2XeRBvbltNltPGXW260yyk9iqgVIZ025DSg0obVLlw5Vzw/pAAAc5LLvDgFuA2lIL3OUKILCAc2ASMqc4kF6Sr0+b2kpjjQq9W0TRUg8pX/87t9fLspuW4vF7ah9cv0/C585LAY0MdEo+ogtuhUaNG5R5LSUlBehzkbX4Ry853QXrRRfcnpM9UtOFtT/8DBggQ4DwkkM7g6/jeRwgRB8QAaadT6/eCM3xrU218tzuXbIcSMRUXrOGRHuE0MGs4kpeFy6uM78su2SfVa88gZ+1D2A/9CIDK1IjQS6ZiaDKiRvpk/3Mn9kM/FL52pq0kf+tkwi+bWaN5L2Qyc/K489X3Wbp+C/XCQph8z23cfNVAPzkpJU9N+4ov5/4JwISRg3n9vtsLW0Fpeg7FVKwx7egrLuXTZ/0LMNe6/rn53Pn25yzdvJN6IcFMvuNGbh7Ux09OSslTn/3El38oqU0Thvbn9TtHF+rr8Xh58Zvf+HrxP+RZ7bRs1IC/3nmKsKC6q+2TabVz18/LWbr/GPXMBl4d2pubu/pH5a44mMyrf21ia8opwo16Dj5dMvo8MTOXO35ezoajJ2gcFsQH11zKoFZxdaZ3AerQUGLemEJQv364s7I4+dab5Mzzb0tk6tWL+vc/gKFDezw5uRy4tF+J44a2bWn44ksY2rTGm28h64cfODWthh25RCCqE0AIEQkMAKKllG8KIWIAlZSyyv3aLijDt/2knalbskuMHctzcyjbRQOzhr3FjN3e7KLIWCklWctvwZm6vHDMa03BuvezGhk+V+bOQqOnMkWjCYnHeWLNac9XE4QQiSjlf5oVtB0SQtwBjJVSDhBCSMCK4uJ2oPjUP5VS/lRsjhUom8/uYlNfIaVcK4R4BbgGaAu8KqV8sa4+y/1vzUCn1ZC6+Hu27T/M1Q+/QKf45rRv0aSE3Ge/L2LeyrVsmTUdIeCq+5+hWUwDJl03rFBmy6zptIyLqStVy9b/w28U/X+dzraDSVz9zDt0atGY9k1jS+q/4G/mrd7Mls8mI4CrHp9Cs+goJo0YBMCL3/zG2t0H+HfqCzSuH8nuxGQMurrt8PHA7/+g06hJeeF2tqWmM/LLhXSKjqR9w5JpHSadltsuasvoLvFMWb7Zb56xs5bQq0lD5k8czqI9SYz+bjF7nhhLVJCxTvWPfvllpMvFvp4XYWjXjsZffIF9zx4cBw6UkJNWG1m//IJq/nzq3eNfLavR+x+Qt2QJiTffhDY2lmY//4J9zx7ylv1VMwUv8O4Mvsotv6G4Ny8B3gTigf+hpDVUiQvKYfzDHqVzhVEjuKVtMLd3CKFRUJHt3+czdvUMJvblFBlB5/F/C42eodkNhPR6F1106co5ZZOSkkJKSorf65SUFOyJcwDQRHYh6rodRA5fRtT129GExtfsg54+auDBCo53llIGAa2Br4FpQogXSsncJ6UMKvZY6xs/CDwO1GlilcVmZ/by1bw0aRxBJiN9u7RnxKUXM3PRcj/Zbxcu4+Ex
o4htUI9G9evx8C2j+GZhDX+YaojFZmf2qo28dNt1BBkN9O3YmhG9uzFz6Wo/2W+XrOLhG4YQGxVBo6gIHr5hCN/8qUR5Z+VZ+PC3P/nk0Yk0aVAPIQQdmsVh0OnqTneni9k7D/Hi4IsJ0uvo2yyG4e2aMmvLfj/Zno0bMLZ7a5pHhPgd238qm60pp3jhyp4YtRpGdWpBh+hIZu+s216CwmgkePBVnHzvXbxWK9ZNm8j7axmh117rJ2vbsZ2cOb/jPFa2l00XG0vO3Dng9eI6ehTrpk3oW9Xwex0IbgF4HxgtpbyKohvs9UC1mmZeMIYvw+bhaJ5yne7rGsbwFkFc0dTMK30jaRKiGL99OemohGBQTDMO5GTi9dUxdSQrreYMzUcTPuh7zB3uJ2LoUswda9bvzJWhBCKZOzyISq90a9eEtiKoVD/A/5C3gP8JIcIqEpJSpkspvwPuBp7yuR4qREr5jZRyEdXsm1Vd9h9NQaNW06pJ0eqoU3xzEg4n+ckmHE6iU3yzYnLNSDhc8ods4KTHaXTVGK5//FUSU0/UneI+9icfV/SPiy7Sq0UcCYn+XpyEpBQ6tWhcTK4xCYnKTdbOI8fQqNX8tnIjja6/j7a3PsaMOUv95qhV3U9lo1GpaBVV9OfTOboeCcczqzVPwvFMmkeGEmwoMtKdouuRcKJ681QXfbNm4PHgPHKkcMy+Zw+G+OoXUMj46ktCR40CjQZds+aYunYlf7X/zUu1qaUO7EKICCHE70IIixAiSQhxSzlyQggxRQiR4XtMEcWaagohpG+OfN+jWg1hT4OmUsplvucFAZZOqum9vGAM34EsJwBhehWd6+sLxw0aFY2CFffPvuwMmgaF0S48CrvHzdF8peNzQUd3Y/y4wvOEEOhj/PeNqoPbZ/j0jS4vMS7OXOTWJmAFitugKsxF+YP7b1uUV0C+1UaI2VRiLDTITJ7V5i9rsxNabL8rNMhMvtVGQeH25R9P4dDcr9j9yyfEREUw8pEXcbvrtuhxvs1BiKmkOy/UbCLPZi9D1k5osc8aajaRb7MjpSTlVCY5FisHktM4OOtdfnrhfl7+9neWbtpZZ7pbHC5C9CVdqSFGHXkOZ7XmyXe6CDGUXJmGGnTk2V011rEiVGYznvz8EmPevDxU5urvieYtX07IVUNol7CH+GXLyPrlZ+w7dtRQQ193hsoeVWM6isFogBIR+ZEQoqzOunehbFF0BjqhuBMnlZLpXMzD498BunZJEEKUTlS/HKjWH/YFY/gSc5UvTfMwbWEUZ2n25WTQMjSClqERymvfnp87axcAuqjutaaPlBKPJUXpLm9sACh7fq7MnbiyEmrtfU6D54H7hRBRlQlKKV1AOhBRbPhDIUS277HldBQQQtwlhNgkhNh0KrtaRdcJMhnJtVhLjOVarASb/PeGgoyGErK5FitBJmNhcMil3Tqi02oJCw7ivUcmcST1OHsSa7VZfBk66cktZaRzLTaCjYay9S8mm2u1EWQ0IITAqFcMx7O3XotRr6NTi8aMHtiLRRu215nuZr2WXEdJ45RndxKsr557NUinJc9e0ljmOpwEG+p2f9JrsaAOKplOpAoKwmuxlHNG2ahDQ2ny1decmjqVhLZt2NenN0H9LiV87NiaKShAqNSVPiqdRggzcB1KtZN8KeW/wDxgXBni44F3pJTJUsoU4B2UlIIzxaPALCHEN4BRCPEJyrZLtdrdXzCGz+pS7uJDdGV/5CyHjZM2C/EhEbQMUX7HC4JdpDMHEAidkt6QsWAQ6XN6kT6vb+H5adY8fjy4izmJe6umkNcF0oPQhRb+0KbP7kb67G5kLlSCE6TXjSNlOXmbXyRn7cPkbZ2MI7lu96CklLuABcCTlckKIbRAFFDcB/WAlDLM9+h2mjp8KqXsIaXsERXmvwdUEa0aN8Lt8XDgaNG+6o79h2nXvImfbLvmTdhxoMittePAEdo1b+wnV4AQgrru4tUqtqGif/LxIr0OH6VdqcAW
gHZNGrHjUJEh3nHoKO2aKukzHZsrEZDFb/HqOi6iVVQYbq+XA6eKAsi2p2bQrmFEBWf5065hBIczc0sYvx2p6bRrUL15qovjyBFQq9E1bVo4ZmjbFvsB/z3KitA2bgxeLzm/zwaPB/fx4+QsmE/wgAE11LDW9vhaAW4pZfEPth0oa8XX3nesIrl/hBDHhRCzhRBNq/55qo+Uch3KynM38CVwBOgppdxYnXkuGMNX8J0v73erILClZWgE8QUrvpyCyM6SvxiujK240jfjSi9a0Lyx7V9uXv4b1y75iS3paZUrpFJc0tJtLfOw15lDxry+ZC4ajDNtFUgv7qw95Pxb2stQJ7wA3AmUn4SoMBJlg3lDnWtURcxGA9cO7MOLn87EYrOzevtu5v2zjrFDLvOTHTf0Mt7//ndSTqaTeiqD92bNZvwwxe28+1AS2/YfwuPxkG+18dgHn9MoKpK2zeo2pN5sNHBt3x68+PVviv679jNvzRbGXnGJv/5X9uX9XxeTciqT1PQs3vtlEeMHK2H1LWIa0Ldja16fNQ+H08WepBR++nsdw3p1rTvddVqu7dCcl5ZswOJ0sfpIGvMTjjCmm/8emdcrsbvcuLxepFSeO31u5FZRYXSOqccrSzdid7mZs/MwO9MyGNWxRZ3pDiBtNvL+/JP6Dz+MMBoxdu9O8BWXk/P77/7CQiB0OoRGU/Rcq6xInUeOgBCEXn01CIGmXj1Chw3HvreKN8UVUTVXZ70Cj4nvcVepWYKA0q6UHCC4jHcM8h0rLhdUbJ+vP9AUaAOkAguEEHWSLSCEUPsixzOklG9KKe+VUr5RnTSGAi4Yw2f2rfRyHGV3qC+I4jyQk8HvR/aiU6kLjaHQhQAS6cwCwBh/K0JfFM+R53Tw1b5tXN+sHUFaHVN3VW4HhFAhdGHgseN1Ka6UkD4fojLUA8Cyewau9M2YOz5M5PBlhPb5gPDLZhJ1Yy18eSpBSnkQ+Al4oGzdRYQQYgzKPsEUKWWlVdGFEFohhAHlb04jhDAIIeokBG3a4/diszuIHnwzY599k+lP3Ev7Fk1YtXUXof1HFcrdNWoow/r2pMst99D55rsZcslF3DVqKAAnMrO45ek3CB94PfHXTiAx9QRz330RrabuM4CmPXgbNoeT6OvvZezkGUx/8DbaN41l1Y59hA4r2kK5a/hlDOvVhS53Pk3nO55iyMWduWt4kYGf9cw9JJ1Mp/61d3P1M+/w0m3XM6hbWTf1tcfUUf2xudzEvPgl475fwrRR/WnfMJJ/D6cS9swnhXKrjqQS/PQnjPhiAUez8wl++hOGfFaULzdrzJVsTj5J1POf88yitfw07qo6T2UASHv+OYTeQJuNm4j94APSnnsOx4EDmC66iDY7dxXKmXr2pN3efTT56mt0jRopz7/5FgBvfj7H7v4/IidMoM22bTRf+AeO/fs5NW1azZQryOOrPLglvcBj4nt8WmqmfKC0KyWEsgPPSsuGAPnStxEupfxHSumUUmajRIQ3Q0lZqnWklB7f/DW2WzXpwH7GS5ZJr3KHWBW/9sY0O+9tziJIK/joigaoS1U5f3rDMl7f9m+JsUbmYJLHPELm4mE4kpcQPngBhjhlX/XU7z1xZ+0ieoKV6bs3cN/qRawZOYGv923jmwPbOXbLw0QZizbFCyq4FE9tyFhwGc7jq4gYuqQwUObkz22Rzmy09XvjODqfiKsWoY+9nNz1j+NM+weA0H4fo43sUtHHrbZTy5fHd4eU8i/f6zjgALCujDw+J4rL4zMp5ffF5lgBzJRS+kV2CSG+RtkvKM7tUsqvK9KrR9t4uf7bD6v7cc4e8qrcKeWsRAbaEp0xbkxJZbfDUdSBvV0ruWHWjErPU3e7osIO7L49viygvZTygG/sWyBVSvlkKdk1wFdSys98rycAd0kpe5UxrxplRdhHSlnTSJ7ydJ8AXIrilUqmmB2SUpa9qimDcyqBXbpt5K5/DNepTeibjEBliFJchV4Xuga9
0TXsi9vr5dUt/3DMkkO43sgrPQZi1GiJD1fcEPkuyYbjdnrHKHePuU4vGTZP4equOCmWPPKcDjSR3XAkL8F2cGah4SvAKyVTd21AAJO3riLVkofD4+HzvVt4qms/vzmLo4nsjPP4KmyHfvSLENVGdMBxdD7OUxvRx16OMf5WPPnJ2I/8gnTllzPj6fP/7d13fJRltsDx35n0QgIkoSV0QSSI2EVREV3EAqjo6iquq9eyuOpdy666q17xrnfXtpa1LHbssooiKiqoCCgWlCI1kBAghNAS0uvMuX88b8ikkYSUN+M8389nPpl532cmJyHMM+9TzlHVAXUebwMi/R432Zmq6tgDnPsd7k6KW1ZgE2mTskSqWiwis4H7nCQVozDTFvXTA8ErwC0i8jGmk7kV+JcJR1KBMMyKyijgb8B2YF2rg2xc9Ydq/4U44sTW7BGkwOr4fBV4i7fjK8/FE9EdT3g8FXuXU7L+eSL6nkMXxGu7AAAgAElEQVRk4vFcsuAdvtqxhX+OHs/dy75k6c4sPjjzEhIiozmkaxib9lXy9PJ9ZOyrJCJE+HJrCZcNj2N9/h66RURy0cDhAKzM3cl3u7aTlr+Xw/ueSfHKf1CW/hZ5vkrCEo/CW5QJwILtGWzI38t1hx3N6J5mAcJDK7/hmbXL+NMRJxHqafyqPKLPaZSseZLSDS8CENZtBL7SHCQkkujUGynNmEXRT/dStW8toV2H7/+elmUFqbZLWXY9ZnHILkwB12mqukZETgbmOYkqAGYAg6jZLvC8cwzMVohngBSgGPgGONdZ7d1eBjbdpGkB1fF5wuPpPv495mau4/+++5BDoquIDR/Fg7/JJjY6kTfXfMkgyeWvxw4mMjSPj0Yfxays7by1+nOuP/pcLhsex/8u3YtX4aOMmiXKXvWxKT+XsX0GMOMUk/XmjU0/c9kXs9mQv5ejBp9EePIZVGxfQNnmdynb/K55YkgEgofbRpzBkd0OpU9kNNFhwq0j4libt4OM/CKGdmt8VWJE37MJiR+GN389pRtepHphuoREEhKVROL5yyjLnEPFjq+oyltNWI/jiRnxR8ISa7ZVNDSEalnWL1Hb5epU1VzM/ry6xxdjFrRUP1ZMxqU/N9D2C0wWp3YnIr1UNUdV62eiOAgB1fH5VPm/5Yt5ZNVSLj1kBId378m/1y3jxI/f4f0zL2ZCRC4nRWyiwteTyJAoPIUbuDFkHVrqw+c7k0O7h3Pn8d15aXU+2UVmfjA1MZyYiBIqfF6GxNUsl66+v2HfHkSEbuPeIP+b/96fWzO0Wypxox8j0teTkuJo1vlgZFIIceEeesf05cSeKfSMPvAfqXhC6Xb6G+Qt+DXegk3mWGgMXY77OwCesFiih1zGkLHmb666c8sr8/LB+nxyir28vnQjidEhrNtbTqXPbNDvF9e++50sy3KBEMxlidLwW2QjIrNV9YIDtD+ggOr4KrxexvVO5bSeR3DleROZV17C0qVL2VFSiPjCyO42kW0hZ5EQ5aFXTCgkHoviZFX2+diWv4O95SVcPCySXaVVhHqUYd3CKPfCX0aNYXxKzXLpYV0T+cuoMYzs3hOfKt6wOGJOeYmx0z7j2++/JySmD6GeEE4s2cGxw1ZSThRl3t5UFFTgqywiXkuIijqKosoYiisrWL4pjRARckqKiAoNJSYkzOwLix/OSbfvY+ywLjzy+BNEpPyKsKge7CmtYvO+cqp8Phau3oRHIH1fKWEeDwlhlZybsJ78mCKqPD3RMqGyoog4TxWJEQMx1Tosy/plEeSXn4uzMXUnN8e25sVa0/F1eJrwyNBQTuxtthGk/bBo//GeMTVXaiMSI+o9r9qh3Xs3el1+/3Gn13rcJTyi1jGPs/J+5fqcWu2I6UNITB8igfgGXjcciA1rPHNFCLBxc/19f4lRoSRGNfbPE0FM1PE0mSDTsqxfDiGYyxK16S6CgLris4JQdUb6QBVzwHzfnZ4cU3/jfCCpqvrQ7RAOXr23
egnmoc5QETmNmguuuo+r5xyb92JtHJxlWZbVXgL5Q2Dr7MKsQq22t85jxaw+bRbb8VmWZQWCIK7AXnefcWvZjs+yLCtQBHkF9rZiOz7LsqyAENRzfG3KdnyWZVmBwnZ8baI1v0W1t057C2q5+QVMuXU6cSdNYtA5l/PmvIYXe6kqdzzxPD3GXUiPcRdyxxPP76++DhB69JnEnTSJ+DGTiR8zmWvve7SD4i9kyu33Ezf2QgaddxVvfrqw8fiffJke4y+lx/hLuePJl2vHf8JE4sZeSPxpFxF/2kVce3/7J/vOLShiyt3/JO6sKxl0yU28ueDrxmOf8SY9Jl9Lj8nXcseMN2vF7vX6uPuFWfS98Hq6nn0Vx1xzJ/uKWlYQ9mCExMcz4NkZjFi/lsO+WULXyZMabBczejSD33qTEatXcdjXS+qdjxw+nMHvzDLnv1tKj5tubH1wghnqbOpmNcle8QU5pypDT8CLKUHyCXCDqhY550/EJJ89FvABi4DbVXWtc34s8AU1lRuygX+o6kt+38O/skMZMB+TG7CmYmkbuvGBpwgPCyV7/tus2JDOpP++m5FDB5E6eECtds/N/pgPFi7lpzefQUSYcP2dDOzTi+suPHd/m5/eeoZD+jZVlrCN43/434SHhpL98ausSMtg0q33MXLIQFLrFNN97v1P+GDRt/z02hMIwoSb7mZgn55cd8FZNfG/+gSH9O24hAY3Pv6SiX32M6zYlMmkOx9i5OD+pA6sXUj3ublf8MHXy/jp+b+b2P/0dwb2TuK6SaYe4r0vv8PSNWkseXI6/XomsiYzi8jw9s9IlPy3/0UrK1l71DFEpQ5n4EsvUrpuHeVpG2u185WUkDtrFvJBJD3/8Id6r9P/X4+T/8mnpP/6EsL7pnDIu+9Qtm4dBfNbWUjadmxtwl43WwATnaS0o4AjgTsBRGQ08BkwB5MOZiCmHNHXIuK/dDjbeX4ccDPwnIjUzRVwhNNmENANuLc9fpDi0jJmf76E6dOuIDY6ijFHjmDiqaN57aPP67V95cP53Dx1Cik9k0jukcjNU6cwc+789gir2YpLy5j95TdMv26qiX9UKhNPPo7X5n1Zr+0rH3/BzZeeR0qPRJJ7JHDzpecxs4Gfs6MUl5Yxe9H3TL/qImKjIhlz+DAmnng0r81fXK/tK58t4uaLziYlKYHkpO7cfNHZzPzEJKXIKyziiXfmMePWa+jfKwkRYcTAvkSGN54Ioi14oqKIP2sCOx5+BF9JCcU/LCN/wQK6X1A/M1bpypXkzX6Piq1bG3yt8JQU8t5/H3w+KrZspfiHH4gcWr8gb8sIeDxN36wm2d9SG9GqEkrWP0fBD3dTsv4FfGW5bofUYqqaA3yK6QABHgReUdXHVbVQVXNV9S7gWxrouNT4GMgFRjbyPQqAD4Dh7fAjkLYli9CQEIb2r7nCGDlkIGsz6ue2XZu+hZFDavrvkUMH1Wt32tW3kTz+Ei687T4ys3PqvkSbS9u6ndAQD0P71Vxlmvjrv8GuzdjKyEMG1m63uXa706bdSfLZl3Ph7f9HZvbO9gscSMvKMb/7vr1rYhrcj7WZ9Qtkr83MYuTgmivYkYf039/u54xthIaE8O6i70i+YBqHXX4LT7/3WbvGDhAxaBB4vaaCuqNs7Toihg5p8WvtfuFFuk+ZAqGhRAwaRPRRR1G4uP6QaMtJM25WU2zH10zqq6I0/S1yPzufvfPOpuD7v1BVkA5A5Z4f2TVrGPlLrqd45T/IX/J7dr09BPW1vjrH5oI8Zqxdxqz0Na1+raaISApwFrBJRKIx9bn+00DTWcCvGni+R0QmAYnApka+RzdMVvh2qXBaVFpKXGx0rWPxsTEUlpQ20LaMeL+28bExFJWU7p9r+uK5h0n/8BXWvPs8fZK6M/mP91BV5W2PsGvFFBdTJ/6YZsYfUyf+Z/5O+nvPs+btZ0z8t93XrvEXlZYRF127Snp8TDSFJWWNxB7l1y6KotIyVJXtu3PJ
Ly5h47YcNr35OG/f+0fum/ku85f9XO912pInJhpvYe1al97CQkJiYht5RuMKPv+c+LPPYmTaeoYt/ILct2dRuqoNarOKp+mb1ST7W2oG9XnJ/XQS+768nPKtH1KxfT7Fqx6iYOnNqK+KvC8ux1eyg5DY/sSM+G8i+k1Eq4rg4Kvb7/f6pp/5/ZKP+O3C9yirqmqDn6ZB74tIIbANkyHhf4DumL+P+olEzbFEv8d9RGQfUAq8B9yiqsvrPOcnp80eoB81Nb3qEZFrRWSZiCzbnZffoh8kNiqKgqKSWscKikvoUucN2bSNpKC4pFa72OgoxJlHOeWowwkPC6Nrl1gevW0am7fnsG5zw0NbbaVuTE3HX9MhFpTUif/IETXx33wNm7N3si5zW/vGXqeDLigppUt0ZMNt/WMvLiU2KhIRISrCDGne9dvziYoIZ+Tgflw8bjTzvlvRbrED+IpLCOlSu5MLiY3FW9yyws8h8fEMemUmOx9/glVDDmXtcSfQ5ZRTSLh8ausCFDvU2Vbsb6kZStPfoGL7fDxRveg67g0Sp6wgfszThMSkULn7e7wFGwmJ7UfilBXEnfAw3cfPJmHS123y6eurHWbordzr5fvd7VZz7zxV7YLJeD4M06nlYRaz9G6gfW9MB1YtW1W7Yub4ngDGNfCco5w2kZjilYtFpP47IqCqz6rqMap6TFK3hlJ/N25o/xSqvF42bq35Xa3amMHwOgtDAIYP7s+qtIyadmkNt6smImg7L5od2i+ZKq+PjVuza+LatJnhg/rVazt8UD9WbawZllu1cTPDB9ZvV01Eaq2cbGtDU3qZ331WzWelVZu2MHxASr22wweksCq95kPEqvSt+9sd7vys4reQQzpgCK88IwNCQggfMGD/scjhh9Vb2NKU8P79UJ+PvHdng9dLZU4O++bOJW7caW0QpR3qbAudtuNbuTeHk+a8yJg5L5JbVn+YpyOVb5kLQNzxDxA16CLCuqUSPewa4k56kvLshQBEDbkCT1jNp8XwpGMQT/MWzZZ7lfV7K1i+s4xNeRVUeM2bU6XPyzc7t3Fqb/Nm/NWOzLb6kRqkql8BLwMPq2oxsBS4qIGmvwbqraJQ1XLgduBwEalX5NJpU4mp4jwQGNE2kdeIiYrk/HEnce+/X6G4tIyvV6zhg4VLmXrO6fXaXn7OGTz2+my279pD9u69PPraO1wx0YzgrknPZMWGdLxeL0Ulpfzp0WdJ7pHIYQMa71jaLP6xo7n3uddN/CvX8sGi75h6Vv03zcvPGsdjb77P9l17TfxvvMcVzs+5JmMLK9IyauJ/4kWSkxI4bGDf9o395GO596V3TOw/b+CDb35k6q9Orh/7+JN57D8fs313Ltl78nh01kdcMeEUAAYn92TMyGH8/bX3Ka+oZN2W7bz95VLOOeHIdosdwFdaSv4nn9Lr1lvwREURfczRxP/qV+TOnl2/sQgSEYGEhpl95RERSJhZdVqesRlBzFYIEUKTkuh67rmUrlvf6hjF42nyZjWt025neHnDCr7dlYVPlVkZa/j98GNci6Vi5zcARPQ1y8R9ZXup3i5XXUA2tKtZxFieNZ+qfetM+/4T6T9sTKOvuzFzGy+vLuDbHaWoQkJUCEWVPiJDhCfP6MmPu3dQUlXJJYNHkF6Qx6Id7TvM5ngMyBSRI4A7gE9FZD3wEubv5VZgNGZ7Qz2qWiEijwD3AO/XPS8iIcCVmGHRjLrn28KTd9zA1dP/Se8zfk1CfBxP3XkjqYMHsHj5z5x7413kL5kDwLVTziFjew6jLr4OgKvOO4trp5wDwM69+7jhH/8ia+duYqIiGT1yOHMeu4+wsPb/L/Pkn6Zx9f2P0/usqSTEd+GpP08jdVB/Fq9Yw7k330v+l2ba9drzJ5j4p95g4p84nmvPn2Diz93HDQ8+Q9auPSb+w4cx5+F7CAtt3/if/ONVXP3gDHpfMI2EuFie+uNVpA5MYfGq9Zx7+wPkzzO7XK6deDoZ2bsY
9V+3m9jPPo1rJ9Z8OHn9rhu45qFnzT6/bnFMv/IiTj+6zT8n1bP9r3fR9+GHGL78R7x5eWT99S7K0zYSc9yxDJz5MqsPSwUg5vjjOWTWW/ufN3LjBoqWfkv6xZfgKyoi87rr6H3nHaTc/zd8ZeUUfL6AnU/8q5XR2cwtbUVaMfTRbmMmVT4fya//k+OTktlYkEu3iEi+mfxftdoUVpTjVSUmLIywNkrcureshN98/i6VPh9/SD2WCweZhYc7XooDXzm9ripDRMiZ2R2tLARPGJEDp1CW/hbdzniHyAGT2ffVlZRufA2Abmd+QKTTWSYnm1V61VXUAZ5duY+F20o5IimCPxzZldhwDz5VMvMrGdQ1nAdXfM3t3y9g5ZTfc//yxXy4NY19v7u9OT9vs8c7nH18V6vqAr9jzwA9VHWKiIzB7OM7BjP0uRizj2+103Ys8Jqqpvg9PxrYClypqnPr7OPzARuAu1T106biO2b4UP3utSeb++N0PpUVbkfQOqWFbkfQKqt/e6fbIRy0S3N2sKaifP//5WOOPEKXffFJk8+T7n1+VFX3rhQCQKf8+PBZVjq7Sou5eHAqFw9KZenOLDbl12wP+Co7k24zH2Do2//i/M/extdG8xZ//m4+y/Zk0zUikuuXfLR/iNUTHg/qQ8vzAIjsPwkJN3NPnsgkALwlZl4jcsAUwvvUH1ary6fK4izz+ucPjSU23PxTeEQY1NVM7n+1YwsxoWGkxMQxKqEnJVWVLNud3ehrHgxVHeDf6TnHpqnqFOf+ElUdq6qxqhqnqudUd3rO+YX+nZ5zrERVE1V1rvNYVDXG7zWObU6nZ1lWHXZVZ5volL+lVzeaZb/f795OekGec2wlANuK8rlowX+Y0PcQXh57HvO2bWL6jwtb/T0X7djCixtWMKn/oVw48DB2l5Vwx/emPwjveSIA5Vnm01bXsS8T0sXs/wrvZQp1lm81BS8j+59LeO9Tmvx+AkSFmg9zxRW+eue9Ph9LcrZSXFVJwisP8pcfvtgfp2VZwagZ6cpsZpdm6XQdX35FGe9nrqdvTBxr8nazo7SQxMhoXt24ipLKCi6YP4uEyGhO7d2fNXm7mHbYMdz30yLmZB78xHGF18vvF3/IofEJDI1PYEtRPr8ZPILn1v/EkpytRPQ3+foKvrud0vS3qdzzE1phsm1F9B4LIRGUZ31KwQ93UZn7M96CpleBiQiTh5jFMC+vLuD7HaXsKKpiU14Fc9OLWJm7k4LKclJi4jgyoRdHJvQiVDz7V3lalhWE2qjjE5HuIvKeiBSLyBYRubSRdiIiD4jIXuf2gPgttxWRUSLyo4iUOF9HNfQ6nU2nW9wyd0saHhEeHX0mU5w5tgdXfM30n75ixd6dbL9pOig85gxvKtDbI0y77zkmZ9XPENGQuvNt5d5Knh8SRffcpfSpyiEkqgcVPTxcHx1L78rtRA2+hLL0NynP+pR9X9bsxZHwrngiE+hy1D0U/vBXilc+QPHKB2rOy4Hn4s4eGEN8uIfFWaU8vXwfFT6IDhVGJkWwqTgTgJdOncwZKebq8rS5M1mSs5Uqn49Qu3rLsoKL0JZDmU8BFZg8vaOAj0RkparWzZRxLSbhxBGYt9v5wGbg3yISjkln+BjwNHAdMEdEhqhqp57c7nQd39QhI5k6xGS78u+g/jzKDClmb8uqdS57e/P2tlXtW0/p5ncJiepB5qp5qLeM0s3v4ivdRXi3EQz86Q+Edk8l9uQv0coC2PwOw/bMJSK8COk7mm7j51C2+R1KM94BlPCkY4gaegUAMSP/RGXMAMpWPQJlOYTFDyVm2NWUJIzjmWV57Cyp4pVvNpIcG8qmvAoqfUpUqIcB8WGMSYnm4uNNSqSt27II8ZgPU08tL+I3fZI4kiyq8n2or4IrUhLoE1bFzuJ8krt0a5tfuGVZAULaZIO6iMQAU4ARTjL6JSLyAXA5ZiW3vyuAR1Q1y3nuI8A1wL8x+35DgcfUrJJ8QkRuw+zj
bXoVjos6XcdXWV5IcV4GPoUNa7/Dg5fivevBV4EnJoXSikoqqmpSgRUV7QZvJZ6wKEqrKskp2E0VocRFxVJaWUoYPrqGR+ING8CquP+itDSXXvt8hBCOyhCi4kaRHNeLLlM2UL7jK/JXP4UnNBrEQ1jfcwlJOpatBWVsLSxBI08n4agJlHmriAjxEO+LIqa8ghV7dpPvOZrr7ilAI+KZ+d4M+sXGM0jKuDjxB3YWFlFcnsLe8jAqKwrpIuV0S+xHjsayITebZxfNJSkqlk82Lyc+1MOw+ASuHJzKhV1D8PpCOPGkk4gMg7nvv8Z5cYmEagUZe3dRWF7J5HETUG8VC5Z+SWSI0D2yCzFR0Qf4DVuWFbjaZA5vKFClqml+x1YCpzbQNtU5598u1e/cKq29NWCVc7xTd3yt2c5gWe1ORHYD7TmxmUjtLDSBxsbvrvaMv7+qJlU/EJFPqJ0qsDGRmPJf1Z5V1Wf9Xudk4D+q2svv2DXAZao61v+FRMQLpKrqeufxECANsz7kLufcJX7tXwc2quq9zf0h3dDprvgsy5//f/z2ICLLAnnPk43fXR0Zv6pOaKOXKsKkF/QXBzS0abNu2zigSFVVRFryOp2KXSFhWZYVXNKAUOfqrdoRQEMlYNY45xpqtwYY6b/KE1OOrP1LybSS7fgsy7KCiJOHdzZwn4jEiMhJwGTg1QaavwLcIiLJItIHk7LwZefcQsAL3CQiESJyg3P8i/aMvy3Yjs8Kds823aRTs/G7K1Djvx6IwpQhexOYpqprRORkZwiz2gxgLvAzsBr4yDmGs2XhPOC3wD7gKkyll069lQHs4hbLsiwryNgrPsuyLCuo2I7PsizLCiq247OCipN7cJA0lU/OsqxfLNvxWUHFyTLxM+1YT9JqGRH5yO0YWiPQ4w9GtuOzgtFyTNqmgCYiHhHp7XYcbWCJ2wG0UqDHH3Tsqk4r6IjI34CpmP1I2/C7+lPVF10Kq9lEpCsmG/6FQKWqxojIJOA4Vb3L3egsq/OzHZ8VdETky0ZOqaqO69BgDoKIvAXkAfcBa1W1m4gkAd+o6pADP9tdIhIPHArE+h9X1U6/6RkCP37LsB2fZQUYJ3F3H1WtFJFcVe3uHM9X1XiXw2uUiPwOUweuCCjxO6WqOsiVoFog0OO3atgk1VZQEpEE4Gygl6o+5KRj8lTXHevk8jFZ+ndUHxCRfv6PO6n7gQtVdZ7bgRykQI/fctjFLVbQEZFTgQ3AZcA9zuEhwDOuBdUyzwPvishpgEdERgMzMcVBO7NQ4DO3g2iFQI/fctihTivoiMhy4DZV/VxE8pw5skhgi6r2dDu+pjjZ8G8CrgP6A1sx+RMf1078H1pEbgG6AP+rqj6342mpQI/fqmE7PivoVHd2zv1cVe0uIh5gt6omuBzeL4qI+K+aFaAXUAHs9W+nqv06OLRmCfT4rYbZOT4rGK0VkTNV9VO/Y2dgNrZ3eiKyEngNeDMA5iSnuh1AKwV6/FYD7BWfFXRE5ATgQ0yJlV9jao5NBCar6g9uxtYcInI+8BvM4pwfgTeA/6hqrquBWVaAsB2fFZScVZxTMXNk24DXAuDqqRYR6QJcgOkETwY+V9VJ7kbVOBEJB34HjKL+PrjfuhFTSwR6/FYNO9RpBR0RicDM5z3odyxMRCJUtdzF0FpEVQtF5A1MEdBwzBVgZzYTOAJT2HSny7EcjECP33LYKz4r6IjIIuDPqvqt37ETgH+o6ljXAmsmZ1XnOOBS4HxgC2a48y1V3eZmbAciInnAQFXd53YsByPQ47dq2Cs+KxgdDnxX59j3mE/zgSAbkz3kLeAkVV3ncjzNtRWIcDuIVgj0+C2H7fisYJQP9ARy/I71BIrdCafFJqvq924HcRBeAeaIyOPUGSoMkFyXgR6/5bBDnVbQEZFHgCMxm8AzgMHAP4GfVfUWN2NrjIgMUNVM536jeSFVNaPDgmohEdncyKmAyHUZ6PFbNWzHZwUdJ0vLI8CVmKGr
MuAlTDaXMjdja4yIFKpqF+e+D7OpWuo0U1W1leUtqwm247OClrNIJBHY05lTff2SiEgYcAKmusTbIhIDoKoBMcwc6PFbhk1SbQUlp67asZiFLqeJyDgR6fS1+ABE5IlGjj/W0bG0hIgcDqQBzwEvOIdPBTp98V8I/PitGvaKzwo6gV5XTUQKVDWugeN7O3OuURFZAsxQ1Vf9koPHAGmqmux2fE0J9PitGnZVpxWMArKumohc5dwN9btfbRCwp4NDaqlUTI5RcBI/q2qxiES5F1KLBHr8lsN2fFYwCtS6apc7X8P97oN5E94JXNHhEbVMJnA0sKz6gIgcB2xyK6AWyiSw47cctuOzgtEDwF0iElB11VT1NAAR+Zuq3uV2PAfhbuAjEfk3EC4idwK/B65xN6xmC/T4LYed47OCjlNjLWDrqonIeCBTVdP8jh0K9FPV+e5F1jQRORLTUVQnB39OVX90N6rmC/T4LcN2fFbQEZFTGzunql91ZCwHQ0Q2Aqeo6g6/Y32Ahao61L3ILCsw2I7PsgKMiOSranydYwLkN7Ta07Ks2uw+PivoiEiEiNwvIhkiku8cGy8iN7gdWzNlNLDncCzQWEoty7L82Cs+K+iIyNNAMvAPYJ6qdhWRZOAzVU11N7qmichkTG24F4B0TK7RK4ErVXWOm7FZViCwHZ8VdERkB3CIswcrV1W7O8f3qWpXl8NrFmcZ/VVAX8wiixdU9Qd3ozowERmuqmsbOH6mqn7qRkxWcLLbGaxgVEGdv30RSaLOCs/OzClLFGiliT4UkdNVdf+QrIhMBJ4FersXVvOIyKs4G9frKAeygPdVdWXHRmUdDDvHZwWj/wAzRWQggIj0Bp7EFHYNCCIySkRuFJHpInJf9c3tuJrwJ+BT5/eNiFwAzADOdTWq5ssHJmOqYmQ5XycBXuAwYKmI/Na98Kzmsld8VjD6C2YT+89ANLARk3h4uptBNZeIXAs8isk+cxYwDxgPdOr5PVV9V0TigPki8hRmQ/gEVV3lcmjNNRQ4W1W/rj4gIqOB+1T1VyIyAXgMU7DW6sTsHJ8V1JwhzoAqSyQimzALWRb7JUs+C7hEVTtV2jIRaWhU6WbgNkxnvQYgEDLoOCuAE1S1yu9YGObvJ97ZUlKoqrGuBWk1i+34rKBwoKrl/jpzBfNq/tUZRGQvkKSqPv+FOp2FX9HcWoedr9XFdAOigK6IfAV8C/yPqpY5BY3vBU5U1VOcv7GFgZD9J9jZoU4rWGyi4arl/hTo9G/AQJaIDFDVTEx9uMkisgezaKezGeh2AG3oCuANoEBEcoHumITVlznnuwPXuxSb1QL2is+yAoxTT3Cnqs5zhjjfwVRsuElVn3E1uCAgIn2BPsAOVd3qdjxWy9mOz7ICnIiEA+GqWuR2LE0RkUmYquWJ+F19q2rArK8wkkUAAAV2SURBVIYUkR5ArXm8QBgit2rYoU4rKIjIJ6o6wbm/mIb3Y6Gqp3RoYK1Q9w1YRHp05jdgEfkfTBmft4CLMFsZLgXedjOu5nJWbb5A/T2HgTJEbjnsFZ8VFETkUlV9w7nf6MpHVZ3ZcVEdHL834F7UnrPs1ItERGQLcI6qrq7OkuNkoLlLVSe5HV9TRCQdeAiYqaqlbsdjHTzb8VlBQ0SOBspVdbXzuAdm31UqZrXerQEyXBiQb8D+VSVEZBeQrKqVDVWb6IycBS0JgbT1xWqYzdxiBZPHMFdJ1Z4FhjhfU4EH3QjqIHQDZgRSp+dIF5HqJOCrgWkicjmQ52JMLfECJhm4FeDsFZ8VNJwl/8mqWi4iXYHdQKqqpjkr9b5R1b7uRtk0EXkIWKeqL7odS0uIyNlAkaouEpHjgdcxc5TXq+psd6NrmjM3fBywBcjxPxdIc8OW7fisICIi+4BuqqrOPNmz/puNRaRQVbu4F2HzOG/AxwOZ2DfgDhPoc8NWDbuq0womazCrCWcBlwALqk849fjyXYqrpZ53bgFHRIZh/g16quoNInIo
EBEI+Tpt5/bLYef4rGByOzDDWaRwDiZRdbWLga8bfFYnISLjnMrr2w5w67RE5CJgEaYIcPW+vS7AP10LqoVE5EoR+UJENjhf7ZxfALJDnVZQEZEumCz7aapa6Hf8UEyC4WzXgmuCiGxuoomqarNykrpBRNZhEmmv9EuuHQZkq2qS2/E1RUT+iumwH8HM8/XHJNx+TVXvdzM2q2Vsx2dZVodwEmonOnOsuaraXURCMR1fD7fja4rzwWOsqm7xO9YfWKSq/d2LzGopO9RpWVZH+RG4vM6xSwicSvIxmJXA/vYCUS7EYrWCveKzLKtDOAtbPgM2AycACzHDzuNVdaOLoTWLiLyCmZO8A9iKGeq8HyhR1bodutWJ2Y7PsqwOIyLRwLmYTmMb8GEgZMsBcKrHP4lZCBUKVGJWCN+kqvvcjM1qGdvxWZbVrkTk15h5sJwmG3dSTiX5sZiVv5WY6hJ7AqFyvFWf7fgsy2pXIpIGDAbSMdsZvsJ0hFsO+MROJlASHFhNs4tbLMtqV6o6FLN3769AKXArJm/nFhF5VUSudjXA5lskIie4HYTVevaKz7KsDici3YBrgFuApM5cTqmaiDwN/AaYg5mf3P/mqar3uBWX1XI2ZZllWe1ORAQYBZzi3E4EsjGLQxa7GFpLRAHvO/dT3AzEah17xWdZVrsSkY+AI4ENwBLn9o1/5hzL6kh2js+yrPY2FCjH7N9LBzYFYqfn5Hht6Piujo7Fah17xWdZVrsTkV7AyZhhzpMx2wG+xgxzLlHVFS6G1ywNrep0co3mqGqCS2FZB8F2fJZldbhAWtzi1D9UYDSwtM7pFGCNqk7s8MCsg2YXt1iW1e4aWNwyBugKLAM6eyX55wEBjgVe8DuuwE7gCzeCsg6eveKzLKtdicjHmKulcOA7nA3swFJVLXMztpYQkWGqut7tOKzWs1d8lmW1t0XA34AfVLXS7WBaSkSOBspVdbXzOAl4DBiBGfq8LVDyjVqGveKzLMs6AGeOb7qqLnAezwH6AC9jNrSvUtXr3YvQainb8VmWZR2AiOwBklW1XES6AruAEaqaJiJ9MXsS+7obpdUSdh+fZVnWgYUCFc79EzDbF9IAVHUbZpGOFUBsx2dZlnVga4CLnPuXAAuqT4hIMpDvRlDWwbNDnZZlWQcgImOAuZjtC15gjKpucM7dAhyvqhe7GKLVQrbjsyzLaoKIdMGkXkvzT7cmIocChaqa7VpwVovZjs+yLMsKKnaOz7IsywoqtuOzLMuygort+CzLsqygYjs+y7IsK6jYjs+yLMsKKv8PnqauUMfVIE0AAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 432x288 with 11 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Calculate motif frequency in each class\n", "occupied_cutoff = 0.5\n", "motif_freq_df = wt_occupancy_grouper.apply(lambda x: (x > occupied_cutoff).sum() / len(x))\n", "# Sort by the feature importance in the logistic model\n", "feature_importance = occ_clf.coef_[0]\n", "feature_order = feature_importance.argsort()\n", "motif_freq_df = motif_freq_df.iloc[:, feature_order]\n", "\n", "# Make the fig\n", "fig, ax_list = plt.subplots(nrows=8, ncols=2, figsize=(6, 4), gridspec_kw=dict(width_ratios=[1, 2]))\n", "gs = ax_list[0, 
0].get_gridspec()\n", "for ax in ax_list[:, 1]:\n", " ax.remove()\n", " \n", "axbig = fig.add_subplot(gs[:, 1])\n", "\n", "ax = axbig\n", "vmax = 0.25\n", "thresh = vmax / 2\n", "motif_freq_no_crx_df = motif_freq_df.drop(columns=\"CRX\")\n", "heatmap = ax.imshow(motif_freq_no_crx_df.T, aspect=\"auto\", vmin=0, vmax=vmax, cmap=\"Reds\")\n", "ax.set_xticks(np.arange(len(wt_activity_names_oneline)))\n", "ax.set_xticklabels(wt_activity_names_oneline, rotation=90)\n", "ax.set_yticks(np.arange(len(motif_freq_no_crx_df.columns)))\n", "ax.set_yticklabels(motif_freq_no_crx_df.columns)\n", "plot_utils.annotate_heatmap(ax, motif_freq_no_crx_df, thresh)\n", "\n", "# Add the logos\n", "for cax, tf in zip(ax_list[1:, 0], motif_freq_no_crx_df.columns):\n", " pwm = logomaker.transform_matrix(pwms[tf], from_type=\"probability\", to_type=\"information\")\n", " logomaker.Logo(pwm, ax=cax, color_scheme=\"colorblind_safe\", show_spines=False)\n", " # Right-align the logos\n", " cax.set_xlim(left=motif_len[tf] - motif_len.max() - 0.5)\n", " cax.set_ylim(top=2)\n", " cax.set_xticks([])\n", " cax.set_yticks([])\n", "\n", "# Add a colorbar\n", "divider = make_axes_locatable(ax)\n", "cax = divider.append_axes(\"right\", size=\"5%\", pad=\"2%\")\n", "colorbar = fig.colorbar(heatmap, cax=cax, label=\"Frequency of motif\")\n", "ticks = cax.get_yticks()\n", "ticks = [f\"{i:.2f}\" for i in ticks]\n", "ticks[-1] = r\"$\\geq$\" + ticks[-1]\n", "cax.set_yticklabels(ticks)\n", "\n", "# Add CRX\n", "cax = divider.append_axes(\"top\", size=\"14%\", pad=\"2%\")\n", "heatmap = cax.imshow(motif_freq_df[\"CRX\"].to_frame().T, aspect=\"auto\", vmin=0, vmax=vmax, cmap=\"Reds\")\n", "cax.xaxis.tick_top()\n", "cax.set_xticks(ax.get_xticks())\n", "cax.set_xlim(ax.get_xlim())\n", "cax.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)\n", "cax.set_yticks([0])\n", "cax.set_yticklabels([\"CRX\"])\n", "plot_utils.annotate_heatmap(cax, motif_freq_df[\"CRX\"].to_frame(), thresh)\n", "\n", "# Add CRX 
logo\n", "cax = ax_list[0, 0]\n", "pwm = logomaker.transform_matrix(pwms[\"CRX\"], from_type=\"probability\", to_type=\"information\")\n", "logomaker.Logo(pwm, ax=cax, color_scheme=\"colorblind_safe\", show_spines=False)\n", "# Right-align the CRX logo with the other logos\n", "cax.set_xlim(left=motif_len[\"CRX\"] - motif_len.max() - 0.5)\n", "cax.set_ylim(top=2)\n", "cax.set_xticks([])\n", "cax.set_yticks([])\n", "\n", "plot_utils.add_letter(cax, 0, 1.03, \"c\")\n", "print(\"Figure 2c\")\n", "fig.tight_layout(pad=0)\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "caption": "(**d**) Frequency of co-occurring TF motifs in strong enhancers. Lower triangle is expected co-occurrence if motifs are independent. (**e**) Frequency of activity classes, colored as in (**b**), for sequences in CRX, NRL, and/or MEF2D ChIP-seq peaks. (**f**) Frequency of TF ChIP-seq peaks in activity classes. TFs in (**c**) are sorted by feature importance of the logistic regression model in (**a**).", "id": "fig2def", "label": "Figure 2d, e, and f" }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>group_name_WT</th>\n", " <th>Silencer</th>\n", " <th>Inactive</th>\n", " <th>Weak enhancer</th>\n", " <th>Strong enhancer</th>\n", " </tr>\n", " <tr>\n", " <th>binding_group</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>No binding</th>\n", " <td>0.221493</td>\n", " <td>0.286300</td>\n", " <td>0.331419</td>\n", " <td>0.160788</td>\n", " </tr>\n", " <tr>\n", " <th>CRX only</th>\n", " <td>0.203553</td>\n", " <td>0.222276</td>\n", "
<td>0.346615</td>\n", " <td>0.227556</td>\n", " </tr>\n", " <tr>\n", " <th>CRX+NRL</th>\n", " <td>0.192560</td>\n", " <td>0.115974</td>\n", " <td>0.238512</td>\n", " <td>0.452954</td>\n", " </tr>\n", " <tr>\n", " <th>CRX+MEF2D</th>\n", " <td>0.145000</td>\n", " <td>0.165000</td>\n", " <td>0.280000</td>\n", " <td>0.410000</td>\n", " </tr>\n", " <tr>\n", " <th>All three</th>\n", " <td>0.099338</td>\n", " <td>0.105960</td>\n", " <td>0.284768</td>\n", " <td>0.509934</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "group_name_WT Silencer Inactive Weak enhancer Strong enhancer\n", "binding_group \n", "No binding 0.221493 0.286300 0.331419 0.160788\n", "CRX only 0.203553 0.222276 0.346615 0.227556\n", "CRX+NRL 0.192560 0.115974 0.238512 0.452954\n", "CRX+MEF2D 0.145000 0.165000 0.280000 0.410000\n", "All three 0.099338 0.105960 0.284768 0.509934" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Figure 2, panels D-F\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x288 with 7 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Setup figure\n", "fig, ax_list = plt.subplots(nrows=2, ncols=2, figsize=(8, 4), gridspec_kw=dict(height_ratios=[3, 2]))\n", "ax2d = ax_list[0, 0]\n", "ax2f = ax_list[1, 0]\n", "for ax in ax_list[:, 1]:\n", " ax.remove()\n", "\n", "ax2e = fig.add_subplot(ax2d.get_gridspec()[:, 1])\n", "\n", "# Calculate co-occurrance of motifs in strong enhancers\n", "strong_enh_coocc_df = wt_occupancy_grouper.get_group(\"Strong enhancer\")[[\"RAX\", \"NRL\", \"MAZ\", \"NDF1\", \"RORB\"]]\n", "strong_enh_coocc_df = (strong_enh_coocc_df > occupied_cutoff).astype(int)\n", "strong_enh_coocc_df = strong_enh_coocc_df.T.dot(strong_enh_coocc_df) / len(strong_enh_coocc_df)\n", "# Fill in lower triangle with the expected values\n", "for row in range(len(strong_enh_coocc_df)):\n", " for col in range(row + 1, len(strong_enh_coocc_df)):\n", " 
strong_enh_coocc_df.iloc[row, col] = strong_enh_coocc_df.iloc[row, row] * strong_enh_coocc_df.iloc[col, col]\n", " \n", "# 2d: Make the heatmap\n", "ax = ax2d\n", "vmax = 0.25\n", "thresh = vmax / 2\n", "heatmap = ax.imshow(strong_enh_coocc_df, aspect=\"auto\", cmap=\"Reds\", vmax=vmax, vmin=0)\n", "ax.set_title(\"Strong enhancers\")\n", "ax.set_xticks(np.arange(len(strong_enh_coocc_df.columns)))\n", "ax.set_xticklabels(strong_enh_coocc_df.columns)\n", "ax.set_yticks(np.arange(len(strong_enh_coocc_df.columns)))\n", "ax.set_yticklabels(strong_enh_coocc_df.columns)\n", "plot_utils.annotate_heatmap(ax, strong_enh_coocc_df, thresh, adjust_lower_triangle=True)\n", "\n", "# Add colorbar\n", "divider = make_axes_locatable(ax)\n", "cax = divider.append_axes(\"right\", size=\"5%\", pad=\"2%\")\n", "colorbar = fig.colorbar(heatmap, cax=cax, label=\"Freq. motifs\\nco-occur\", ticks=[0, round(thresh, 2), vmax])\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"d\")\n", "\n", "# Calculate activity classes for different binding combos\n", "binding_combos_activity_freq = activity_measured_wt_df.groupby(\"binding_group\")[\"group_name_WT\"].value_counts().unstack()\n", "binding_combos_activity_freq = binding_combos_activity_freq[class_sort_order]\n", "# Ignore cases where there is NRL or MEF2D but not CRX\n", "binding_combos_activity_freq = binding_combos_activity_freq.loc[[\"No binding\", \"CRX only\", \"CRX+NRL\", \"CRX+MEF2D\", \"All three\"]]\n", "binding_combos_activity_freq = binding_combos_activity_freq.astype(int)\n", "\n", "# Generate names then normalize data\n", "binding_combos_names = binding_combos_activity_freq.index.values\n", "binding_combos_count = [j.sum() for i, j in binding_combos_activity_freq.iterrows()]\n", "binding_combos_activity_freq = binding_combos_activity_freq.div(binding_combos_activity_freq.sum(axis=1), axis=0)\n", "display(binding_combos_activity_freq)\n", "\n", "# 2e: make plot\n", "ax = ax2e\n", "fig = 
plot_utils.stacked_bar_plots(binding_combos_activity_freq, \"Fraction of group\", binding_combos_names, color_mapping, figax=(fig, ax), vert=True)\n", "ax.set_yticks(np.linspace(0, 1, 6))\n", "plot_utils.rotate_ticks(ax.get_xticklabels())\n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(binding_combos_count, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"e\")\n", "\n", "# Frequency each class is bound by each TF\n", "group_bound_freqs = activity_measured_wt_df.groupby(\"group_name_WT\")[[\"crx_bound\", \"nrl_bound\", \"mef2d_bound\"]].apply(lambda x: x.sum() / len(x))\n", "group_bound_freqs.columns = group_bound_freqs.columns.str.split(\"_\").str[0].str.upper()\n", "\n", "# 2f: Make heatmap\n", "vmax = 1\n", "thresh = vmax / 2\n", "ax = ax2f\n", "heatmap = ax.imshow(group_bound_freqs.T, aspect=\"auto\", cmap=\"Reds\", vmax=vmax, vmin=0)\n", "ax.set_xticks(np.arange(len(wt_activity_names_oneline)))\n", "ax.set_xticklabels(wt_activity_names_oneline, rotation=90)\n", "ax.set_yticks(np.arange(len(group_bound_freqs.columns)))\n", "ax.set_yticklabels(group_bound_freqs.columns)\n", "plot_utils.annotate_heatmap(ax, group_bound_freqs, thresh)\n", "\n", "# Add colorbar\n", "divider = make_axes_locatable(ax)\n", "cax = divider.append_axes(\"right\", size=\"5%\", pad=\"2%\")\n", "colorbar = fig.colorbar(heatmap, cax=cax, label=\"Fraction\\nbound\")\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"f\")\n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_axes_locator(ax.get_axes_locator())\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)\n", "\n", "print(\"Figure 2, panels D-F\")\n", "fig.tight_layout(pad=0)\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 
Lineage-defining TF motifs differentiate strong enhancers from silencers\n", "\n", "We performed a de novo motif enrichment analysis to identify motifs that distinguish strong enhancers from silencers and found several differentially enriched motifs matching known TFs. For motifs that matched multiple TFs, we selected one representative TF for downstream analysis, since TFs from the same family have PWMs that are too similar to meaningfully distinguish between motifs for these TFs ([Figure 2—figure supplement 2](#fig2s2), Materials and methods). Strong enhancers are enriched for several motif families that include TFs that interact with CRX or are important for photoreceptor development: NeuroD1/NDF1 (E-box-binding bHLH) [@bib59], RORB (nuclear receptor) [@bib36; @bib79], MAZ or Sp4 (C2H2 zinc finger) [@bib51], and NRL (bZIP) [@bib55; @bib56]. Sp4 physically interacts with CRX in the retina [@bib51], but we chose to represent the zinc finger motif with MAZ because it has a higher quality score in the HOCOMOCO database [@bib46]. Silencers were enriched for a motif that resembles a partial K50 homeodomain motif but instead matches the zinc finger TF GFI1, a member of the Snail repressor family [@bib8] expressed in developing retinal ganglion cells [@bib88]. Therefore, while strong enhancers and silencers are not distinguished by their CRX motif content, strong enhancers are uniquely enriched for several lineage-defining TFs.\n", "\n", "To quantify how well these TF motifs differentiate strong enhancers from silencers, we trained two different classification models with fivefold cross-validation. First, we trained a 6-mer support vector machine (SVM) [@bib19] and achieved an AUROC of 0.781 ± 0.013 and AUPR of 0.812 ± 0.020 ([Figure 2a](#fig2) and [Figure 2—figure supplement 1](#fig2ab)). 
The SVM considers all 2080 non-redundant 6-mers and provides an upper bound on the predictive power of models that do not consider the exact arrangement or spacing of sequence features. We next trained a logistic regression model on the predicted occupancy for eight lineage-defining TFs ([Supplementary file 4](#supp4)) and compared it to the upper bound established by the SVM. In this model, we considered CRX, the five TFs identified in our motif enrichment analysis, and two additional TFs enriched in photoreceptor ATAC-seq peaks [@bib31]: RAX, a Q50 homeodomain TF that contrasts with CRX, a K50 homeodomain TF [@bib34], and MEF2D, a MADS box TF that co-binds with CRX [@bib2]. The logistic regression model performs nearly as well as the SVM (AUROC 0.698 ± 0.036, AUPR 0.745 ± 0.032, [Figure 2a](#fig2) and [Figure 2—figure supplement 1](#fig2ab)) despite a 260-fold reduction from 2080 to 8 features. To determine whether the logistic regression model depends specifically on the eight lineage-defining TFs, we established a null distribution by fitting 100 logistic regression models with randomly selected TFs (Materials and methods). Our logistic regression model outperforms the null distribution (one-tailed Z-test for AUROC and AUPR, p < 0.0008, [Figure 2—figure supplement 3](#fig2s3)), indicating that the performance of the model specifically requires the eight lineage-defining TFs. To determine whether the SVM identified any additional motifs that could be added to the logistic regression model, we generated de novo motifs using the SVM _k_-mer scores and found no additional motifs predictive of strong enhancers. Finally, we found that our two models perform similarly on an independent test set of CRX-targeted sequences ([@bib85]; [Figure 2—figure supplement 3](#fig2s3)). 
Since the logistic regression model performs near the upper bound established by the SVM and depends specifically on the eight selected motifs, we conclude that these motifs comprise nearly all of the sequence features captured by the SVM that distinguish strong enhancers from silencers among CRX-targeted sequences." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "figure: Figure 2—figure supplement 2.\n", ":::\n", "![](elife-67403.xml.media/fig2-figsupp2.jpg)\n", "\n", "### Results from de novo motif analysis.\n", "\n", "Motifs enriched in strong enhancers (**a**) and silencers (**b**). Bottom, de novo motif identified with DREME; top, matched known motif identified with TOMTOM.\n", ":::\n", "{#fig2s2}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "figure: Figure 2—figure supplement 3.\n", ":::\n", "### Additional validation of the eight transcription factors (TFs) predicted occupancy logistic regression model.\n", ":::\n", "{#fig2s3}" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "caption": "Predictions of the 6-mer support vector machine (SVM) (black) and eight TFs predicted occupancy logistic regression model (orange) on an independent test set. (**a**) Receiver operating characteristic, (**b**) precision recall curve. Dashed black line represents chance in both panels.", "id": "fig2s3ab", "label": "Figure 2—figure supplement 3 a and b." }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Only panels A and B are shown here. Generating the data for panels C and D will take approximately 50 minutes. 
If you are interested in generating these panels, the code is in the next cell, but commented out.\n", "Computing predicted occupancy of all TFs on the test set.\n", "Done computing predicted occupancy.\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>CRX</th>\n", " <th>GFI1</th>\n", " <th>MAZ</th>\n", " <th>MEF2D</th>\n", " <th>NDF1</th>\n", " <th>NRL</th>\n", " <th>RORB</th>\n", " <th>RAX</th>\n", " </tr>\n", " <tr>\n", " <th>label</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>chr1_100559800_SCRUBR</th>\n", " <td>0.274096</td>\n", " <td>2.545296e-13</td>\n", " <td>1.630613e-11</td>\n", " <td>4.707551e-14</td>\n", " <td>1.017487e-07</td>\n", " <td>0.000854</td>\n", " <td>4.694361e-05</td>\n", " <td>0.008889</td>\n", " </tr>\n", " <tr>\n", " <th>chr1_100559800_UBR</th>\n", " <td>1.178397</td>\n", " <td>5.862032e-11</td>\n", " <td>1.102815e-06</td>\n", " <td>1.221394e-10</td>\n", " <td>1.066875e-03</td>\n", " <td>0.000541</td>\n", " <td>8.777171e-07</td>\n", " <td>0.001608</td>\n", " </tr>\n", " <tr>\n", " <th>chr1_100750470_UBR</th>\n", " <td>2.430898</td>\n", " <td>8.232504e-07</td>\n", " <td>5.564299e-11</td>\n", " <td>2.960941e-10</td>\n", " <td>1.272582e-02</td>\n", " <td>0.969272</td>\n", " <td>1.295348e-06</td>\n", " <td>0.001267</td>\n", " </tr>\n", " <tr>\n", " <th>chr1_108920170_UBR</th>\n", " <td>2.072197</td>\n", " <td>7.323860e-03</td>\n", " <td>6.147587e-16</td>\n", " <td>4.758899e-09</td>\n", " <td>2.658399e-10</td>\n", " 
<td>0.808744</td>\n", " <td>5.559077e-03</td>\n", " <td>0.003341</td>\n", " </tr>\n", " <tr>\n", " <th>chr1_11177090_SCRUBR</th>\n", " <td>3.214338</td>\n", " <td>4.034044e-04</td>\n", " <td>4.444271e-14</td>\n", " <td>2.389581e-07</td>\n", " <td>3.627830e-10</td>\n", " <td>0.000005</td>\n", " <td>1.550753e-03</td>\n", " <td>2.118118</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " CRX GFI1 MAZ MEF2D \\\n", "label \n", "chr1_100559800_SCRUBR 0.274096 2.545296e-13 1.630613e-11 4.707551e-14 \n", "chr1_100559800_UBR 1.178397 5.862032e-11 1.102815e-06 1.221394e-10 \n", "chr1_100750470_UBR 2.430898 8.232504e-07 5.564299e-11 2.960941e-10 \n", "chr1_108920170_UBR 2.072197 7.323860e-03 6.147587e-16 4.758899e-09 \n", "chr1_11177090_SCRUBR 3.214338 4.034044e-04 4.444271e-14 2.389581e-07 \n", "\n", " NDF1 NRL RORB RAX \n", "label \n", "chr1_100559800_SCRUBR 1.017487e-07 0.000854 4.694361e-05 0.008889 \n", "chr1_100559800_UBR 1.066875e-03 0.000541 8.777171e-07 0.001608 \n", "chr1_100750470_UBR 1.272582e-02 0.969272 1.295348e-06 0.001267 \n", "chr1_108920170_UBR 2.658399e-10 0.808744 5.559077e-03 0.003341 \n", "chr1_11177090_SCRUBR 3.627830e-10 0.000005 1.550753e-03 2.118118 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Making predictions on the test set with the SVM and 8 TF logistic regression model.\n", "Model performance on White 2013 test set:\n", "SVM\tAUROC = 0.800\tAUPR = 0.821\n", "8 TFs\tAUROC = 0.662\tAUPR = 0.714\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x288 with 2 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "print(\"Only panels A and B are shown here. Generating the data for panels C and D will take approximately 50 minutes. 
If you are interested in generating these panels, the code is in the next cell, but commented out.\")\n", "white_data_dir = os.path.join(\"Data\", \"Downloaded\", \"CrxMpraLibraries\")\n", "white_seqs = pd.read_csv(os.path.join(white_data_dir, \"white2013Sequences.txt\"), sep=\"\\t\", header=None, usecols=[0, 8], index_col=0, squeeze=True, names=[\"label\", \"sequence\"])\n", "# Only keep barcode1 sequences since barcode info isn't needed\n", "bc_tag = \"_barcode1\"\n", "white_seqs = white_seqs[white_seqs.index.str.contains(bc_tag)]\n", "# Trim off the barcode ID\n", "white_seqs = white_seqs.rename(lambda x: x[1:-len(bc_tag)])\n", "# Only keep the 84 bp of the sequence that corresponds to the library\n", "seq_len = 84\n", "seq_start = len(\"TAGCGTCTGTCCGTGAATTC\") + 1\n", "white_seqs = white_seqs.str[seq_start:seq_start+seq_len]\n", "# Function to correct off by one error in labeling\n", "def correct_label(name):\n", " chrom, pos, group = name.split(\"_\")\n", " pos = int(pos) + 1\n", " return \"_\".join([chrom, str(pos), group])\n", "\n", "white_activity_df = pd.read_csv(os.path.join(white_data_dir, \"white2013Activity.txt\"), sep=\"\\t\", index_col=0, usecols=[0, 1, 2, 3], names=[\"label\", \"class\", \"expression\", \"expression_SEM\"], header=0)\n", "# Correct the off by one error of the labels\n", "white_activity_df = white_activity_df.rename(correct_label)\n", "white_activity_df[\"expression_log2\"] = np.log2(white_activity_df[\"expression\"])\n", "\n", "white_measured_seqs = white_seqs[white_activity_df.index]\n", "\n", "print(\"Computing predicted occupancy of all TFs on the test set.\")\n", "white_occupancy_df = predicted_occupancy.all_seq_total_occupancy(white_measured_seqs, ewms, mu, convert_ewm=False)\n", "print(\"Done computing predicted occupancy.\")\n", "display(white_occupancy_df.head())\n", "\n", "# Define cutoffs\n", "scrambled_mask = white_activity_df[\"class\"].str.contains(\"SCR\")\n", "strong_cutoff = white_activity_df.loc[scrambled_mask, 
\"expression_log2\"].quantile(0.95)\n", "white_scrambled_mean = white_activity_df.loc[scrambled_mask, \"expression_log2\"].mean()\n", "\n", "# Pull out bound sequences\n", "bound_mask = white_activity_df[\"class\"].str.match(\"CBR(M|NO)$\")\n", "bound_activity_df = white_activity_df[bound_mask].copy()\n", "bound_occupancy_df = white_occupancy_df[bound_mask]\n", "\n", "# Pull out relevant sequences\n", "white_strong_mask = bound_activity_df[\"expression_log2\"] > strong_cutoff\n", "white_silencer_mask = bound_activity_df[\"expression_log2\"] < (white_scrambled_mean - 1)\n", "white_modeling_mask = white_strong_mask | white_silencer_mask\n", "white_labels = white_strong_mask[white_modeling_mask]\n", "\n", "# Make predictions\n", "print(\"Making predictions on the test set with the SVM and 8 TF logistic regression model.\")\n", "# Write sequences to file for the SVM\n", "white_modeling_seqs = white_seqs[bound_activity_df.index][white_modeling_mask]\n", "white_modeling_fasta = os.path.join(svm_dir, \"white2013TestSet.fasta\")\n", "fasta_seq_parse_manip.write_fasta(white_modeling_seqs, white_modeling_fasta)\n", "\n", "# SVM\n", "svm_white_tpr, svm_white_prec, svm_white_scores, svm_white_f1 = gkmsvm.predict_and_eval(white_modeling_fasta, white_labels, svm_prefix, word_len, word_len, max_mis, xaxis)\n", "\n", "# Logistic model\n", "occupancy_probs = occ_clf.predict_proba(bound_occupancy_df[white_modeling_mask])\n", "occupancy_white_tpr, occupancy_white_prec, occupancy_white_f1 = modeling.calc_tpr_precision_fbeta(white_labels, occupancy_probs[:, 1], xaxis, positive_cutoff=0.5)\n", "\n", "# Setup figure\n", "fig, ax_list = plot_utils.setup_multiplot(2, n_cols=2, sharex=False, sharey=False)\n", "\n", "# Plot White 2013 test set\n", "_, white_aurocs, _, white_auprs, _ = plot_utils.roc_pr_curves(\n", " modeling_xaxis, [svm_white_tpr, occupancy_white_tpr], [svm_white_prec, occupancy_white_prec],\n", " model_names[:2], model_colors=model_colors[:2], 
prc_chance=svm_white_prec[-1],\n", " figax=([fig, fig], ax_list)\n", ")\n", "\n", "plot_utils.add_letter(ax_list[0], -0.15, 1.03, \"a\")\n", "plot_utils.add_letter(ax_list[1], -0.15, 1.03, \"b\")\n", "\n", "# Display model performance\n", "print(\"Model performance on White 2013 test set:\")\n", "print(f\"{model_names[0]}\\tAUROC = {white_aurocs[0]:.3f}\\tAUPR = {white_auprs[0]:.3f}\")\n", "print(f\"{model_names[1]}\\tAUROC = {white_aurocs[1]:.3f}\\tAUPR = {white_auprs[1]:.3f}\")\n", "fig.tight_layout()\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "figure: Figure 2—figure supplement 3c and d, static.\n", ":::\n", "![](elife-67403.xml.media/fig2-figsupp3.jpg)\n", "\n", "Static version of the figure to display panels (**c**) and (**d**). Null distribution of 100 logistic regression models trained using randomly selected motifs (gray) compared to the true features (orange). Shaded area, 1 standard deviation based on fivefold cross-validation. (**c**) Receiver operating characteristic, (**d**) precision recall curve. Dashed black line represents chance in both panels.\n", ":::\n", "{#fig2s3cd_static}" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "caption": "Interactive version of panels (**c**) and (**d**). Note that this takes close to an hour to run.", "id": "fig2s3cd_int", "label": "Figure 2—figure supplement 3c and d, interactive." 
}, "outputs": [], "source": [ "# # Read in HOCOMOCO database\n", "# hocomoco = predicted_occupancy.read_pwm_files(os.path.join(\"Data\", \"Downloaded\", \"Pwm\", \"photoreceptorMotifsAndHOCOMOCOv11_full_MOUSE.meme\"))\n", "# hocomoco = hocomoco.apply(predicted_occupancy.ewm_from_letter_prob).apply(predicted_occupancy.ewm_to_dict)\n", "\n", "# wt_seqs = all_seqs[all_seqs.index.str.contains(\"WT\")].copy()\n", "# wt_seqs = sequence_annotation_processing.remove_mutations_from_seq_id(wt_seqs)\n", "# wt_seqs = wt_seqs[activity_df.index]\n", "# modeling_seqs = wt_seqs[silencer_modeling_mask]\n", "\n", "# niter = 100\n", "# nfeatures = len(ewms)\n", "# # Track the cross-validated mean TPR and precision for each feature set\n", "# random_tprs = []\n", "# random_precs = []\n", "# # Keep track of the features selected for each round\n", "# random_ewms = []\n", "\n", "# np.random.seed(seed)\n", "# for i in range(niter):\n", "# if i % 10 == 9:\n", "# print(f\"Iteration {i+1}\")\n", " \n", "# # Randomly sample PWMs\n", "# sample = hocomoco.sample(nfeatures)\n", "# random_ewms.append(sample.index.str.split(\"_\").str[0].values)\n", "# # Do predicted occupancy scan\n", "# features = predicted_occupancy.all_seq_total_occupancy(modeling_seqs, sample, mu, convert_ewm=False)\n", "# # Fit the model\n", "# clf = LogisticRegression(C=c_opt)\n", "# clf, tpr, prec, f1 = modeling.train_estimate_variance(clf, cv, features, labels_with_silencer, xaxis, positive_cutoff=0)\n", " \n", "# # Store the result\n", "# random_tprs.append(np.mean(tpr, axis=0))\n", "# random_precs.append(np.mean(prec, axis=0))\n", " \n", "# fig, ax_list = plot_utils.setup_multiplot(2, n_cols=2, sharex=False, sharey=False)\n", "# niter_rand = len(random_occ_tprs)\n", "# rand_tpr_plotting = [[j] for i, j in random_occ_tprs.iterrows()] + [occ_tpr_cv]\n", "# rand_prec_plotting = [[j] for i, j in random_occ_precs.iterrows()] + [occ_prec_cv]\n", "# rand_names = [\"\"] * niter_rand + [\"True features\"]\n", "# rand_colors = 
[\"#8080801A\"] * niter_rand + [\"#E69B04\"]\n", "\n", "# _, background_aurocs, _, background_auprs, _ = plot_utils.roc_pr_curves(\n", "# modeling_xaxis, rand_tpr_plotting, rand_prec_plotting, rand_names, model_colors=rand_colors,\n", "# prc_chance=prc_chance, figax=([fig, fig], ax_list)\n", "# )\n", "\n", "# plot_utils.add_letter(ax_list[0], -0.15, 1.03, \"c\")\n", "# plot_utils.add_letter(ax_list[1], -0.15, 1.03, \"d\")\n", "\n", "# # KS test, null hypothesis: random AUROCs and AUPRs are normally distributed\n", "# # One-tailed Z-test that the real data is drawn from this distribution\n", "# for data, name in zip([background_aurocs, background_auprs], [\"AUROC\", \"AUPR\"]):\n", "# real, rand = data[niter_rand], data[:niter_rand]\n", "# dstat, pval = stats.kstest(stats.zscore(rand), \"norm\")\n", "# print(f\"{name}s of random features are normally distributed, KS test p = {pval:.2f}, D = {dstat:.2f}\")\n", "# zscore = (real - np.mean(rand)) / np.std(rand)\n", "# pval = stats.norm.cdf(-np.abs(zscore))\n", "# print(f\"Probability that the {name} of the real features is drawn from the background distribution, one-tailed Z-test p = {pval:2f}\")\n", "\n", "# display(fig)\n", "# plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strong enhancers are characterized by diverse total motif content\n", "\n", "To understand how these eight TF motifs differentiate strong enhancers from silencers, we first calculated the total predicted occupancy of each sequence by all eight lineage-defining TFs and compared the different activity classes. Strong enhancers and silencers both have higher total predicted occupancies than inactive sequences, but total predicted occupancies do not distinguish strong enhancers and silencers from each other ([Figure 2b](#fig2), [Supplementary file 5](#supp5)). 
Since strong enhancers are enriched for several motifs relative to silencers, this suggests that strong enhancers are distinguished from silencers by the diversity of their motifs, rather than the total number.\n", "\n", "We considered two hypotheses for how the more diverse collection of motifs functions in strong enhancers: either strong enhancers depend on specific combinations of TF motifs (‘TF identity hypothesis’) or they instead must be co-occupied by multiple lineage-defining TFs, regardless of TF identity (‘TF diversity hypothesis’). To distinguish between these hypotheses, we examined which specific motifs contribute to the total motif content of strong enhancers and silencers. We considered motifs for a TF present in a sequence if the TF's predicted occupancy was above 0.5 molecules ([Supplementary file 4](#supp4)), which generally corresponds to at least one motif with a relative _K_~D~ above 3%. This threshold captures the effect of low-affinity motifs that are often biologically relevant [@bib10; @bib15; @bib16; @bib63]. As expected, 97% of strong enhancers and silencers contain CRX motifs since the sequences were selected based on CRX binding or significant matches to the CRX PWM within open chromatin ([Figure 2c](#fig2)). Compared to silencers, strong enhancers contain a broader diversity of motifs for the eight lineage-defining TFs ([Figure 2c](#fig2)). However, while strong enhancers contain a broader range of motifs, no single motif occurs in a majority of strong enhancers: NRL motifs are present in 23% of strong enhancers, NeuroD1 and RORB in 18% each, and MAZ in 16%. Additionally, none of the motifs tend to co-occur as pairs in strong enhancers: no specific pair occurs in more than 5% of sequences ([Figure 2d](#fig2)). We also did not observe a bias in the linear arrangement of motifs in strong enhancers (Materials and methods). Similarly, no single motif occurs in more than 15% of silencers ([Figure 2c](#fig2)). 
These results suggest that strong enhancers are defined by the diversity of their motifs, and not by specific motif combinations or their linear arrangement.\n", "\n", "The results above predict that strong enhancers are more likely to be bound by a diverse but degenerate collection of TFs, compared with silencers or inactive sequences. We tested this prediction by examining in vivo TF binding using published ChIP-seq data for NRL [@bib23] and MEF2D [@bib2]. Consistent with the prediction, sequences bound by CRX and either NRL or MEF2D are approximately twice as likely to be strong enhancers compared to sequences only bound by CRX ([Figure 2e](#fig2)). Sequences bound by all three TFs are the most likely to be strong or weak enhancers rather than silencers or inactive sequences. However, most strong enhancers are not bound by either NRL or MEF2D ([Figure 2f](#fig2)), indicating that binding of these TFs is not required for strong enhancers. Our results support the TF diversity hypothesis: CRX-targeted enhancers are co-occupied by multiple TFs, without a requirement for specific combinations of lineage-defining TFs.\n", "\n", "## Strong enhancers have higher motif information content than silencers\n", "\n", "Our results indicate that both strong enhancers and silencers have a higher total motif content than inactive sequences, while strong enhancers contain a more diverse collection of motifs than silencers. To quantify these differences in the number and diversity of motifs, we computed the information content of CRX-targeted sequences using Boltzmann entropy. The Boltzmann entropy of a system is related to the number of ways the system’s molecules can be arranged, which increases with either the number or diversity of molecules ([@bib67], Chapter 5). In our case, each TF is a different type of molecule and the number of each TF is represented by its predicted occupancy for a _cis_-regulatory sequence. 
The number of molecular arrangements is thus _W_, the number of distinguishable permutations that the TFs can be ordered on the sequence, and the information content of a sequence is then log~2~_W_ (Materials and methods).\n", "\n", "We found that on average, strong enhancers have higher information content than both silencers and inactive sequences (Mann-Whitney U test, p = 1 × 10^–23^ and 7 × 10^–34^, respectively, [Figure 3a](#fig3), [Supplementary file 5](#supp5)), confirming that information content captures the effect of both the number and diversity of motifs. Quantitatively, the average silencer and inactive sequence contains 1.6 and 1.4 bits, respectively, which represents approximately three total motifs for two TFs. Strong enhancers contain on average 2.4 bits, representing approximately three total motifs for three TFs or four total motifs for two TFs. To compare the predictive value of our information content metric to the model based on all eight motifs, we trained a logistic regression model and found that information content classifies strong enhancers from silencers with an AUROC of 0.634 ± 0.008 and an AUPR of 0.663 ± 0.014 ([Figure 3b](#fig3) and [Figure 3—figure supplement 1](#fig3)). This is only slightly worse than the model trained on eight TF occupancies despite an eightfold reduction in the number of features, which is itself comparable to the SVM with 2080 features. The difference between the two logistic regression models suggests that the specific identities of TF motifs make some contribution to the eight TF model, but that most of the signal captured by the SVM can be described with a single metric that does not assign weights to specific motifs. Information content also distinguishes strong enhancers from inactive sequences (AUROC 0.658 ± 0.012, AUPR 0.675 ± 0.019, [Figure 3b](#fig3) and [Figure 3—figure supplement 1](#fig3)). 
These results indicate that strong enhancers are characterized by higher information content, which reflects both the total number and diversity of motifs." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "caption": "### Figure 3: Information content classifies strong enhancers.\n\n(**a**) Information content for different activity classes. (**b**) Receiver operating characteristic of information content to classify strong enhancers from silencers (orange) or inactive sequences (indigo).\n\n### Figure 3—figure supplement 1: Precision recall curve of logistic regression classifier using information content.\n\nOrange, strong enhancer vs. silencer; indigo, strong enhancer vs. inactive; shaded area, 1 standard deviation based on fivefold cross-validation.", "id": "fig3", "label": "Figure 3 and Figure 3—figure supplement 1." }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Information content for each class:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_WT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>Silencer</th>\n", " <td>837.0</td>\n", " <td>1.554721</td>\n", " <td>1.872824</td>\n", " <td>0.000173</td>\n", " <td>0.195721</td>\n", " <td>0.952877</td>\n", " <td>2.240308</td>\n", " <td>15.248629</td>\n", " </tr>\n", " <tr>\n", 
" <th>Inactive</th>\n", " <td>928.0</td>\n", " <td>1.385812</td>\n", " <td>1.646322</td>\n", " <td>0.000105</td>\n", " <td>0.150796</td>\n", " <td>0.841681</td>\n", " <td>2.050814</td>\n", " <td>14.738741</td>\n", " </tr>\n", " <tr>\n", " <th>Weak enhancer</th>\n", " <td>1360.0</td>\n", " <td>1.496780</td>\n", " <td>1.683849</td>\n", " <td>0.000008</td>\n", " <td>0.201747</td>\n", " <td>1.014613</td>\n", " <td>2.216628</td>\n", " <td>17.960698</td>\n", " </tr>\n", " <tr>\n", " <th>Strong enhancer</th>\n", " <td>1051.0</td>\n", " <td>2.383258</td>\n", " <td>2.178600</td>\n", " <td>0.000173</td>\n", " <td>0.635291</td>\n", " <td>1.836731</td>\n", " <td>3.453384</td>\n", " <td>13.082139</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_WT \n", "Silencer 837.0 1.554721 1.872824 0.000173 0.195721 0.952877 \n", "Inactive 928.0 1.385812 1.646322 0.000105 0.150796 0.841681 \n", "Weak enhancer 1360.0 1.496780 1.683849 0.000008 0.201747 1.014613 \n", "Strong enhancer 1051.0 2.383258 2.178600 0.000173 0.635291 1.836731 \n", "\n", " 75% max \n", "group_name_WT \n", "Silencer 2.240308 15.248629 \n", "Inactive 2.050814 14.738741 \n", "Weak enhancer 2.216628 17.960698 \n", "Strong enhancer 3.453384 13.082139 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Strong enhancers and silencers have the same information content, Mann-Whitney U test p = 1e-23 U = 557959.00\n", "Strong enhancers and inactive sequences have the same information content, Mann-Whitney U test p = 7e-34, U = 641607.00\n", "Model metrics:\n", "Strong vs.\n", "silencer\tAUROC=0.634+/-0.008\tAUPR=0.663+/-0.014\n", "Strong vs.\n", "inactive\tAUROC=0.658+/-0.012\tAUPR=0.675+/-0.019\n", "Figure 3:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x288 with 3 Axes>" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": 
"stream", "text": [ "Figure 3--figure supplement 1:\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 288x288 with 1 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Fit logistic regression models\n", "entropy_clf = LogisticRegression()\n", "entropy_clf, entropy_tpr_list, entropy_prec_list, entropy_f1_list = modeling.train_estimate_variance(entropy_clf, cv, wt_entropy_df.loc[silencer_modeling_mask, \"entropy\"], labels_with_silencer, xaxis, positive_cutoff=0)\n", "\n", "inactive_entropy_clf = LogisticRegression()\n", "inactive_entropy_clf, inactive_entropy_tpr_list, inactive_entropy_prec_list, inactive_entropy_f1_list = modeling.train_estimate_variance(inactive_entropy_clf, cv, wt_entropy_df.loc[inactive_modeling_mask, \"entropy\"], labels_with_inactive, xaxis, positive_cutoff=0)\n", "\n", "# Setup figures\n", "fig, ax_list = plot_utils.setup_multiplot(2, sharex=False, sharey=False)\n", "fig_pr, ax_pr = plt.subplots()\n", "\n", "# 3a: violin plot of information content\n", "print(\"Information content for each class:\")\n", "display(wt_entropy_grouper[\"entropy\"].describe())\n", "\n", "ax = ax_list[0]\n", "fig = plot_utils.violin_plot_groupby(wt_entropy_grouper[\"entropy\"], \"Information content\", class_names=wt_activity_names_oneline, class_colors=color_mapping, figax=(fig, ax))\n", "plot_utils.rotate_ticks(ax.get_xticklabels())\n", "ax.set_yticks(np.arange(0, wt_entropy_df[\"entropy\"].max() + 1, 2))\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"a\")\n", "\n", "# Add ticks above to show the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(wt_activity_count, fontsize=10, rotation=45)\n", "\n", "# Statistics for differences in information content\n", "ustat, pval = stats.mannwhitneyu(wt_entropy_grouper[\"entropy\"].get_group(\"Strong enhancer\"), wt_entropy_grouper[\"entropy\"].get_group(\"Silencer\"), 
alternative=\"two-sided\")\n", "print(f\"Strong enhancers and silencers have the same information content, Mann-Whitney U test p = {pval:.0e} U = {ustat:.2f}\")\n", "ustat, pval = stats.mannwhitneyu(wt_entropy_grouper[\"entropy\"].get_group(\"Strong enhancer\"), wt_entropy_grouper[\"entropy\"].get_group(\"Inactive\"), alternative=\"two-sided\")\n", "print(f\"Strong enhancers and inactive sequences have the same information content, Mann-Whitney U test p = {pval:.0e}, U = {ustat:.2f}\")\n", "\n", "# 3b: ROC and PR curves with information content vs. two classes\n", "model_data = [\n", " (entropy_tpr_list, entropy_prec_list, \"Strong vs.\\nsilencer\", \"#E69B04\"),\n", " (inactive_entropy_tpr_list, inactive_entropy_prec_list, \"Strong vs.\\ninactive\", plot_utils.set_color(1))\n", "]\n", "\n", "model_tprs, model_precs, model_names, model_colors = zip(*model_data)\n", "ax = ax_list[1]\n", "\n", "# Plot the models\n", "_, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std = plot_utils.roc_pr_curves(\n", " modeling_xaxis, model_tprs, model_precs, model_names, model_colors=model_colors,\n", " figax=([fig, fig_pr], [ax, ax_pr])\n", ")\n", "ax.set_xticks(np.linspace(0, 1, 6))\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"b\")\n", "\n", "# Display model metrics\n", "print(\"Model metrics:\")\n", "for name, auroc, auroc_std, aupr, aupr_std in zip(model_names, model_aurocs, model_aurocs_std, model_auprs, model_auprs_std):\n", " print(f\"{name}\\tAUROC={auroc:.3f}+/-{auroc_std:.3f}\\tAUPR={aupr:.3f}+/-{aupr_std:.3f}\")\n", " \n", "print(\"Figure 3:\")\n", "fig.tight_layout()\n", "display(fig)\n", "print(\"Figure 3--figure supplement 1:\")\n", "display(fig_pr)\n", "plt.close()\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strong enhancers require high information content but not NRL motifs\n", "\n", "Our results show that except for CRX, none of the lineage-defining motifs occur in a majority of strong enhancers. 
However, all sequences were tested in reporter constructs with the _Rho_ promoter, which contains an NRL motif and three CRX motifs [@bib9; @bib47]. Since NRL is a key co-regulator with CRX in rod photoreceptors, we tested whether strong enhancers generally require NRL, which would be inconsistent with our TF diversity hypothesis. We removed the NRL motif by recloning our MPRA library without the basal _Rho_ promoter. If strong enhancers require an NRL motif for high activity, then only CRX-targeted sequences with NRL motifs will drive reporter expression. If information content (i.e. total motif content and diversity) is the primary determinant of strong enhancers, then only CRX-targeted sequences with sufficient motif diversity, measured by information content, will drive reporter expression, regardless of whether NRL motifs are present.\n", "\n", "We replaced the _Rho_ promoter with a minimal 23 bp polylinker sequence between our libraries and _DsRed_, and repeated the MPRA ([Figure 1—figure supplement 1](#fig1s1), [Supplementary file 3](#supp3)). CRX-targeted sequences were designated as ‘autonomous’ if they retained activity in the absence of the _Rho_ promoter (log~2~(RNA/DNA) > 0, Materials and methods). We found that 90% of autonomous sequences are from the enhancer class, while less than 3% of autonomous sequences are from the silencer class ([Figure 4a](#fig4)). This confirms that the distinction between silencers and enhancers does not depend on the _Rho_ promoter, which is consistent with our previous finding that CRX-targeted silencers repress other promoters [@bib32; @bib86]. However, while most autonomous sequences are enhancers, only 39% of strong enhancers and 9% of weak enhancers act autonomously. 
Consistent with a role for information content, autonomous strong enhancers have higher information content (Mann-Whitney U test p = 4 × 10^–8^, [Figure 4b](#fig4)) and higher predicted CRX occupancy (Mann-Whitney U test p = 9 × 10^–12^, [Figure 4c](#fig4)) than non-autonomous strong enhancers. We found no evidence that specific lineage-defining motifs are required for autonomous activity, including NRL, which is present in only 25% of autonomous strong enhancers ([Figure 4d](#fig4)). Similarly, NRL ChIP-seq binding [@bib23] occurs more often among autonomous strong enhancers (41% vs. 19%, Fisher’s exact test p = 2 × 10^–14^, odds ratio = 3.0), yet NRL binding still only accounts for a minority of these sequences. We thus conclude that strong enhancers require high information content, rather than any specific lineage-defining motifs." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "caption": "### Sequence features of autonomous and non-autonomous strong enhancers.\n\n(**a**) Activity of library in the presence (x-axis) or absence (y-axis) of the _Rho_ promoter. Dark blue, strong enhancers; light blue, weak enhancers; green, inactive; red, silencers; gray, ambiguous; horizontal line, cutoff for autonomous activity. Points on the far left and/or very bottom are sequences that were present in the plasmid pool but not detected in the RNA. (**b–d**) Comparison of autonomous and non-autonomous strong enhancers for information content (**b**), predicted cone-rod homeobox (CRX) occupancy (**c**), and frequency of transcription factor (TF) motifs (**d**).", "id": "fig4", "label": "Figure 4." }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Correlation between WT activity with Rho vs. 
Polylinker:\n", "PCC = 0.338\n", "SCC = 0.359\n", "n = 4751\n", "Fraction of autonomous sequences belonging to each activity class:\n" ] }, { "data": { "text/plain": [ "Strong enhancer 0.693103\n", "Weak enhancer 0.208621\n", "Inactive 0.070690\n", "Silencer 0.027586\n", "Name: group_name_WT, dtype: float64" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Fraction of each activity class that has autonomous activity:\n" ] }, { "data": { "text/plain": [ "group_name_WT\n", "Silencer 0.019394\n", "Inactive 0.044565\n", "Weak enhancer 0.090705\n", "Strong enhancer 0.387657\n", "Name: autonomous_activity, dtype: float64" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Information content of autonomous and non-autonomous strong enhancers:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>autonomous_activity</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>635.0</td>\n", " <td>2.073301</td>\n", " <td>1.964160</td>\n", " <td>0.000173</td>\n", " <td>0.488725</td>\n", " <td>1.624789</td>\n", " <td>3.026204</td>\n", " <td>11.747577</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>402.0</td>\n", " <td>2.888074</td>\n", " <td>2.424544</td>\n", 
" <td>0.000346</td>\n", " <td>0.990757</td>\n", " <td>2.272392</td>\n", " <td>4.401275</td>\n", " <td>13.082139</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "autonomous_activity \n", "False 635.0 2.073301 1.964160 0.000173 0.488725 1.624789 \n", "True 402.0 2.888074 2.424544 0.000346 0.990757 2.272392 \n", "\n", " 75% max \n", "autonomous_activity \n", "False 3.026204 11.747577 \n", "True 4.401275 13.082139 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Autonomous and non-autonomous strong enhancers have the same information content, Mann-Whitney U test p=4e-08, U=101739.00\n", "Predicted CRX occupancy of autonomous and non-autonomous strong enhancers:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>autonomous_activity</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>635.0</td>\n", " <td>2.34943</td>\n", " <td>1.154518</td>\n", " <td>0.003694</td>\n", " <td>1.471752</td>\n", " <td>2.255551</td>\n", " <td>3.075332</td>\n", " <td>7.368500</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>402.0</td>\n", " <td>2.83343</td>\n", " <td>1.127028</td>\n", " <td>0.015596</td>\n", " <td>2.062315</td>\n", " <td>2.858271</td>\n", " 
<td>3.554521</td>\n", " <td>5.852791</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "autonomous_activity \n", "False 635.0 2.34943 1.154518 0.003694 1.471752 2.255551 \n", "True 402.0 2.83343 1.127028 0.015596 2.062315 2.858271 \n", "\n", " 75% max \n", "autonomous_activity \n", "False 3.075332 7.368500 \n", "True 3.554521 5.852791 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Autonomous and non-autonomous strong enhancers have the same predicted CRX occupancy, Mann-Whitney U test p=9e-12, U=95541.00\n", "Strong enhancers with autonomous and non-autonomous activity vs. NRL bound and unbound:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>nrl_bound</th>\n", " <th>False</th>\n", " <th>True</th>\n", " </tr>\n", " <tr>\n", " <th>autonomous_activity</th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>513</td>\n", " <td>122</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>236</td>\n", " <td>166</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "nrl_bound False True \n", "autonomous_activity \n", "False 513 122\n", "True 236 166" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Fisher's exact test that NRL binding and strong enhancer autonomous activity are independent, p=2e-14, odds ratio=3.0\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 576x576 with 8 Axes>" ] }, "metadata": {}, "output_type": 
"display_data" } ], "source": [ "# Keep sequences where the WT and Polylinker were both measured\n", "poly_measured_mask = activity_df[[\"expression_log2_WT\", \"expression_log2_POLY\"]].notna().all(axis=1)\n", "activity_poly_df = activity_df[poly_measured_mask]\n", "wt_occupancy_poly_df = wt_occupancy_df[poly_measured_mask]\n", "wt_entropy_poly_df = wt_entropy_df[poly_measured_mask]\n", "\n", "# Setup the figure\n", "fig, ax_list = plot_utils.setup_multiplot(4, sharex=False, sharey=False)\n", "ax_list = ax_list.flatten()\n", "\n", "# 4a: scatterplot of Rho vs. Polylinker\n", "ax = ax_list[0]\n", "print(\"Correlation between WT activity with Rho vs. Polylinker:\")\n", "fig, ax = plot_utils.scatter_with_corr(activity_poly_df[\"expression_log2_WT\"], activity_poly_df[\"expression_log2_POLY\"], \"log2 Enhancer Activity/Rho\", \"log2 Autonomous Activity\", colors=activity_poly_df[\"plot_color_WT\"], xticks=rho_ticks, figax=(fig, ax))\n", "ax.axhline(0, color=\"k\", linestyle=\"--\")\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"a\")\n", "\n", "# Display some numbers for the manuscript\n", "print(\"Fraction of autonomous sequences belonging to each activity class:\")\n", "display(activity_poly_df.loc[activity_poly_df[\"autonomous_activity\"], \"group_name_WT\"].value_counts(normalize=True))\n", "\n", "print(\"Fraction of each activity class that has autonomous activity:\")\n", "display(activity_poly_df.groupby(\"group_name_WT\")[\"autonomous_activity\"].apply(lambda x: x.sum() / len(x)))\n", "\n", "# Information content of strong autonomous vs. 
non-autonomous\n", "# Set up grouping\n", "strong_enh_poly_mask = activity_poly_df[\"group_name_WT\"].str.contains(\"Strong\")\n", "strong_enh_poly_mask = strong_enh_poly_mask & strong_enh_poly_mask.notna()\n", "autonomous_occ_grouper = wt_occupancy_poly_df[strong_enh_poly_mask].groupby(activity_poly_df.loc[strong_enh_poly_mask, \"autonomous_activity\"])\n", "autonomous_entropy_grouper = wt_entropy_poly_df[strong_enh_poly_mask].groupby(activity_poly_df.loc[strong_enh_poly_mask, \"autonomous_activity\"])\n", "\n", "# Set up for plotting\n", "strong_color = color_mapping[\"Strong enhancer\"]\n", "autonomous_names = [\"Non-autonomous \", \" Autonomous\"]\n", "autonomous_counts = [len(i) for i in autonomous_occ_grouper.groups.values()]\n", "\n", "# Do stats for difference in IC\n", "print(\"Information content of autonomous and non-autonomous strong enhancers:\")\n", "display(autonomous_entropy_grouper[\"entropy\"].describe())\n", "ustat, pval = stats.mannwhitneyu(*[j for i, j in autonomous_entropy_grouper[\"entropy\"]], alternative=\"two-sided\")\n", "print(f\"Autonomous and non-autonomous strong enhancers have the same information content, Mann-Whitney U test p={pval:.0e}, U={ustat:.2f}\")\n", "\n", "# 4b: Make the plot\n", "ax = ax_list[1]\n", "fig = plot_utils.violin_plot_groupby(autonomous_entropy_grouper[\"entropy\"], \"Information content\", class_names=autonomous_names, class_colors=[strong_color]*2, figax=(fig, ax))\n", "ax.set_xlabel(\"Strong enhancers\")\n", "# Add ticks for the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(autonomous_counts, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"b\")\n", "\n", "# Differences in CRX occupancy\n", "print(\"Predicted CRX occupancy of autonomous and non-autonomous strong enhancers:\")\n", "display(autonomous_occ_grouper[\"CRX\"].describe())\n", "ustat, pval = stats.mannwhitneyu(*[j for i, j in 
autonomous_occ_grouper[\"CRX\"]], alternative=\"two-sided\")\n", "print(f\"Autonomous and non-autonomous strong enhancers have the same predicted CRX occupancy, Mann-Whitney U test p={pval:.0e}, U={ustat:.2f}\")\n", "\n", "# 4c\n", "ax = ax_list[2]\n", "fig = plot_utils.violin_plot_groupby(autonomous_occ_grouper[\"CRX\"], \"Predicted CRX occupancy\", class_names=autonomous_names, class_colors=[strong_color]*2, figax=(fig, ax))\n", "ax.set_xlabel(\"Strong enhancers\")\n", "ax.set_yticks(np.arange(8))\n", "# Add ticks for the n\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(autonomous_counts, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.2, 1.03, \"c\")\n", "\n", "# Differences in motif frequencies\n", "autonomous_motif_freq_df = autonomous_occ_grouper.apply(lambda x: (x > occupied_cutoff).sum() / len(x))\n", "# Sort by the feature importance in the logistic model\n", "autonomous_motif_freq_df = autonomous_motif_freq_df.iloc[:, feature_order]\n", "\n", "# 4d: Make the heatmap, but plot CRX separately\n", "ax = ax_list[3]\n", "autonomous_motif_freq_no_crx_df = autonomous_motif_freq_df.drop(columns=\"CRX\")\n", "vmax = 0.25\n", "thresh = vmax / 2\n", "heatmap = ax.imshow(autonomous_motif_freq_no_crx_df.T, aspect=\"auto\", cmap=\"Reds\", vmax=vmax, vmin=0)\n", "ax.set_xlabel(\"Strong enhancers\")\n", "ax.set_xticks(np.arange(len(autonomous_motif_freq_no_crx_df)))\n", "ax.set_xticklabels(autonomous_names)\n", "ax.set_yticks(np.arange(len(autonomous_motif_freq_no_crx_df.columns)))\n", "ax.set_yticklabels(autonomous_motif_freq_no_crx_df.columns)\n", "plot_utils.annotate_heatmap(ax, autonomous_motif_freq_no_crx_df, thresh)\n", "\n", "# Add colorbar\n", "divider = make_axes_locatable(ax)\n", "cax = divider.append_axes(\"right\", size=\"5%\", pad=\"2%\")\n", "colorbar = fig.colorbar(heatmap, cax=cax, label=\"Frequency of motif\")\n", "ticks = cax.get_yticks()\n", "ticks = 
[f\"{i:.2f}\" for i in ticks]\n", "ticks[-1] = r\"$\\geq$\" + ticks[-1]\n", "cax.set_yticklabels(ticks)\n", "\n", "# Add CRX\n", "cax = divider.append_axes(\"top\", size=\"14%\", pad=\"2%\")\n", "heatmap = cax.imshow(autonomous_motif_freq_df[\"CRX\"].to_frame().T, aspect=\"auto\", cmap=\"Reds\", vmax=vmax, vmin=0)\n", "cax.set_xticks([])\n", "cax.set_yticks([0])\n", "cax.set_yticklabels([\"CRX\"])\n", "plot_utils.annotate_heatmap(cax, autonomous_motif_freq_df[\"CRX\"].to_frame(), thresh)\n", "plot_utils.add_letter(cax, -0.2, 1.03, \"d\")\n", "\n", "# Add ticks for the n\n", "cax.xaxis.tick_top()\n", "cax.set_xticks(ax.get_xticks())\n", "cax.set_xlim(ax.get_xlim())\n", "cax.set_xticklabels(autonomous_counts, fontsize=10, rotation=45)\n", "\n", "# Test relationship between NRL binding and strong enhancer autonomous activity\n", "print(\"Strong enhancers with autonomous and non-autonomous activity vs. NRL bound and unbound:\")\n", "nrl_chip_vs_autonomous = activity_poly_df[strong_enh_poly_mask].groupby(\"autonomous_activity\")[\"nrl_bound\"].value_counts().unstack()\n", "display(nrl_chip_vs_autonomous)\n", "oddsratio, pval = stats.fisher_exact(nrl_chip_vs_autonomous)\n", "print(f\"Fisher's exact test that NRL binding and strong enhancer autonomous activity are independent, p={pval:.0e}, odds ratio={oddsratio:.1f}\")\n", "fig.tight_layout()\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TF motifs contribute independently to strong enhancers\n", "\n", "Our results indicate that information content distinguishes strong enhancers from silencers and inactive sequences. Information content only takes into account the total number and diversity of motifs in a sequence and not any potential interactions between them. The classification success of information content thus suggests that each TF motif will contribute independently to enhancer activity. 
We tested this prediction with CRX-targeted sequences where all CRX motifs were abolished by point mutation ([Supplementary file 3](#supp3)). Consistent with our previous work [@bib85], mutating CRX motifs causes the activities of both enhancers and silencers to regress toward basal levels (Pearson’s _r_ = 0.608, [Figure 5a](#fig5)), indicating that most enhancers and silencers show some dependence on CRX. However, 40% of wild-type strong enhancers show low CRX dependence and remain strong enhancers with their CRX motifs abolished. Although strong enhancers with high and low CRX dependence have similar wild-type information content ([Figure 5b](#fig5)), strong enhancers with low CRX dependence have lower predicted CRX occupancy than those with high CRX dependence (Mann-Whitney U test p = 2 × 10^–9^, [Figure 5c](#fig5)), and also have higher ‘residual’ information content (i.e. information content without CRX motifs, Mann-Whitney U test p = 1 × 10^–7^, [Figure 5d](#fig5)). Low CRX dependence sequences have an average of 1.5 residual bits, which corresponds to three motifs for two TFs, while high CRX dependence sequences have an average of 1.0 residual bits, which corresponds to two motifs for two TFs ([Figure 5e](#fig5))." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "caption": "### Independence of transcription factor (TF) motifs in strong enhancers.\n\n(**a**) Activity of sequences with and without cone-rod homeobox (CRX) motifs. Points are colored by the activity group with CRX motifs intact: dark blue, strong enhancers; light blue, weak enhancers; green, inactive; red, silencers; gray, ambiguous; horizontal dotted lines and color bar represent the cutoffs for the same groups when CRX motifs are mutated. Solid black line is the y = x line. (**b–d**) Comparison of strong enhancers with high and low CRX dependence for information content (**b**), predicted CRX occupancy (**c**), and residual information content (**d**). 
(**e**) Representative strong enhancers with high (top) or low (bottom) CRX dependence.", "id": "fig5", "label": "Figure 5." }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Correlation between WT and MUT activities:\n", "PCC = 0.608\n", "SCC = 0.706\n", "n = 4123\n", "Information content of strong enhancers with different mutant activities:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_MUT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>586.0</td>\n", " <td>2.321663</td>\n", " <td>2.067846</td>\n", " <td>0.000346</td>\n", " <td>0.641760</td>\n", " <td>1.849581</td>\n", " <td>3.333561</td>\n", " <td>11.676515</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>344.0</td>\n", " <td>2.857066</td>\n", " <td>2.411316</td>\n", " <td>0.001591</td>\n", " <td>1.145032</td>\n", " <td>2.413095</td>\n", " <td>3.969305</td>\n", " <td>13.082139</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_MUT \n", "False 586.0 2.321663 2.067846 0.000346 0.641760 1.849581 \n", "True 344.0 2.857066 2.411316 0.001591 1.145032 2.413095 \n", "\n", " 75% max \n", "group_name_MUT \n", "False 3.333561 11.676515 \n", "True 3.969305 13.082139 " ] }, "metadata": {}, 
"output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Predicted CRX occupancy of strong enhancers with different mutant activities:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_MUT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>586.0</td>\n", " <td>2.876820</td>\n", " <td>1.069855</td>\n", " <td>0.927761</td>\n", " <td>2.085474</td>\n", " <td>2.857976</td>\n", " <td>3.575210</td>\n", " <td>7.368500</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>344.0</td>\n", " <td>2.454016</td>\n", " <td>1.009155</td>\n", " <td>0.964684</td>\n", " <td>1.626097</td>\n", " <td>2.338718</td>\n", " <td>3.110073</td>\n", " <td>5.730406</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_MUT \n", "False 586.0 2.876820 1.069855 0.927761 2.085474 2.857976 \n", "True 344.0 2.454016 1.009155 0.964684 1.626097 2.338718 \n", "\n", " 75% max \n", "group_name_MUT \n", "False 3.575210 7.368500 \n", "True 3.110073 5.730406 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Strong enhancers that remain strong vs. 
do not have the same CRX occupancy, Mann-Whitney U test p=2e-09, U=124411.00\n", "Residual information content of strong enhancers with different mutant activities:\n" ] }, { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>count</th>\n", " <th>mean</th>\n", " <th>std</th>\n", " <th>min</th>\n", " <th>25%</th>\n", " <th>50%</th>\n", " <th>75%</th>\n", " <th>max</th>\n", " </tr>\n", " <tr>\n", " <th>group_name_MUT</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>False</th>\n", " <td>586.0</td>\n", " <td>1.026129</td>\n", " <td>1.283253</td>\n", " <td>0.000001</td>\n", " <td>0.097540</td>\n", " <td>0.493644</td>\n", " <td>1.472322</td>\n", " <td>7.129248</td>\n", " </tr>\n", " <tr>\n", " <th>True</th>\n", " <td>344.0</td>\n", " <td>1.536551</td>\n", " <td>1.638575</td>\n", " <td>0.000136</td>\n", " <td>0.264872</td>\n", " <td>1.046836</td>\n", " <td>2.338690</td>\n", " <td>9.819172</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " count mean std min 25% 50% \\\n", "group_name_MUT \n", "False 586.0 1.026129 1.283253 0.000001 0.097540 0.493644 \n", "True 344.0 1.536551 1.638575 0.000136 0.264872 1.046836 \n", "\n", " 75% max \n", "group_name_MUT \n", "False 1.472322 7.129248 \n", "True 2.338690 9.819172 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Strong enhancers that stay strong vs. 
do not have the same residual information content, Mann-Whitney U test p=1e-07, U=79938.00\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "<Figure size 504x720 with 10 Axes>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Helper functions to visualize sequence\n", "def hex_to_rgb(hexcode):\n", "    return tuple(int(hexcode[i:i+2], 16) / 255 for i in (1, 3, 5))\n", "\n", "strong_color_rgb = hex_to_rgb(strong_color)\n", "weak_color_rgb = hex_to_rgb(color_mapping[\"Weak enhancer\"])\n", "crx_color = mpl.colors.to_rgb(\"orange\")\n", "other_tf_color = mpl.colors.to_rgb(\"red\")\n", "\n", "def visualize_sequence(seq_id, ax, title, below_text, basecolor):\n", "    seq_occupancy_df = predicted_occupancy.total_landscape(all_seqs[seq_id], ewms, mu)\n", "    visual = np.full(((len(seq_occupancy_df), 3)), basecolor)  # (number of positions, RGB values)\n", "    text_mapping = []  # (name of TF, center position of motif)\n", "    # Loop over each TF, identify motifs, and fill in the representation with the predicted occupancy for the full motif\n", "    for col in seq_occupancy_df:\n", "        tf, orient = col.split(\"_\")\n", "        # Series.items() replaces Series.iteritems(), which was removed in pandas 2.0\n", "        for motif_start, occ in seq_occupancy_df[col].items():\n", "            if occ > occupied_cutoff:\n", "                motif_end = motif_start + motif_len[tf]\n", "                # Check that every position of the motif still has the base color, i.e. no motif was drawn there already\n", "                if (visual[motif_start:motif_end] != basecolor).all(axis=1).any():\n", "                    print(f\"Error, motif already in the range {motif_start}-{motif_end}! 
Skipping.\")\n", "                else:\n", "                    color = crx_color if tf == \"CRX\" else other_tf_color\n", "                    visual[motif_start:motif_end] = color\n", "                    text_mapping.append((tf, (motif_start + motif_end) / 2))\n", "    \n", "    heatmap = ax.imshow(visual[np.newaxis, :], aspect=\"auto\", cmap=\"Reds\")\n", "    ax.set_yticks([])\n", "    # Add text showing which motif is where\n", "    for tf, x in text_mapping:\n", "        ax.text(x, 0, tf, ha=\"center\", va=\"center\", color=\"white\", rotation=90)\n", "\n", "    ax.set_title(title)\n", "    ax.set_xlabel(below_text)\n", "    \n", "    return ax, heatmap\n", "\n", "# Setup for sequences where both WT and MUT were measured\n", "wt_mut_mask = activity_df[\"wt_vs_mut_log2\"].notna()\n", "activity_wt_mut_measured_df = activity_df[wt_mut_mask]\n", "wt_occ_mut_measured_df = wt_occupancy_df[wt_mut_mask]\n", "wt_entropy_mut_measured_df = wt_entropy_df[wt_mut_mask]\n", "mut_entropy_measured_df = mut_entropy_df[wt_mut_mask]\n", "\n", "# Figure setup\n", "gs_kw = dict(height_ratios=[5, 5, 1, 1])\n", "fig, ax_list = plt.subplots(nrows=4, ncols=2, figsize=(7, 10), gridspec_kw=gs_kw)\n", "gs = ax_list[0, 0].get_gridspec()\n", "for row in [2, 3]:\n", "    for ax in ax_list[row]:\n", "        ax.remove()\n", "    \n", "axstrong = fig.add_subplot(gs[2, :])\n", "axweak = fig.add_subplot(gs[3, :])\n", "\n", "# 5a: Scatter plot of WT and MUT activities\n", "ax = ax_list[0, 0]\n", "print(\"Correlation between WT and MUT activities:\")\n", "fig, ax = plot_utils.scatter_with_corr(activity_wt_mut_measured_df[\"expression_log2_WT\"], activity_wt_mut_measured_df[\"expression_log2_MUT\"],\n", "                                       \"log2 WT Activity/Rho\", \"log2 MUT Activity/Rho\", colors=activity_wt_mut_measured_df[\"plot_color_WT\"],\n", "                                       xticks=rho_ticks, yticks=rho_ticks, figax=(fig, ax))\n", "# Plot y = x line\n", "ax.plot(rho_ticks, rho_ticks, color=\"black\", linewidth=1)\n", "# Show cutoffs for different classes\n", "strong_cutoff = activity_df.groupby(\"group_name_WT\")[\"expression_log2_WT\"].get_group(\"Strong 
enhancer\").min()\n", "for line in [-1, 1, strong_cutoff]:\n", " ax.axhline(line, color=\"black\", linestyle=\"--\", linewidth=1)\n", " \n", "# Add colorbar to show the cutoffs\n", "divider = make_axes_locatable(ax)\n", "color_ax = divider.append_axes(\"right\", size=\"5%\")\n", "color_ax.set_ylim(ax.get_ylim())\n", "color_ax.barh([(-1 - ax.get_ylim()[0]) / 2 + ax.get_ylim()[0], 0, (strong_cutoff - 1) / 2 + 1, (ax.get_ylim()[1] - strong_cutoff) / 2 + strong_cutoff], # Midpoint of the bars\n", " [1, 1, 1, 1], # Bar height\n", " [-1 - ax.get_ylim()[0], 2, strong_cutoff - 1, ax.get_ylim()[1] - strong_cutoff], # Bar width\n", " color=color_mapping)\n", "color_ax.set_xticks([])\n", "color_ax.set_yticks([])\n", "color_ax.set_xlim(right=1)\n", "\n", "plot_utils.add_letter(ax, -0.35, 1.03, \"a\")\n", "\n", "# Setup strong enhancer->mutant activity groupings\n", "strong_mask = activity_wt_mut_measured_df[\"group_name_WT\"].str.contains(\"Strong\")\n", "strong_mask = strong_mask & strong_mask.notna()\n", "activity_strong_df = activity_wt_mut_measured_df[strong_mask]\n", "\n", "# Group the data based on CRX-dependence (whether or not it stay strong) and name the groups accordingly\n", "stay_strong_mask = activity_strong_df[\"group_name_MUT\"].str.contains(\"Strong\") & activity_strong_df[\"group_name_MUT\"].notna()\n", "wt_occ_strong_grouper = wt_occ_mut_measured_df[strong_mask].groupby(stay_strong_mask)\n", "wt_entropy_strong_grouper = wt_entropy_mut_measured_df[strong_mask].groupby(stay_strong_mask)\n", "strong_mutant_names = wt_entropy_strong_grouper[\"entropy\"].count().rename({False: \"High\", True: \"Low\"}).index.values.tolist()\n", "strong_mutant_counts = wt_entropy_strong_grouper[\"entropy\"].count().astype(int).values\n", "\n", "# Differences in information content\n", "print(\"Information content of strong enhancers with different mutant activities:\")\n", "display(wt_entropy_strong_grouper[\"entropy\"].describe())\n", "\n", "# 5b: Information content\n", "ax = 
ax_list[0, 1]\n", "fig = plot_utils.violin_plot_groupby(wt_entropy_strong_grouper[\"entropy\"], \"Information content\", class_names=strong_mutant_names, class_colors=color_mapping[[\"Weak enhancer\", \"Strong enhancer\"]], figax=(fig, ax))\n", "ax.set_xlabel(\"CRX-dependence\\nStrong enhancers\")\n", "ax.set_yticks(np.arange(0, 13, 2))\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(strong_mutant_counts, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"b\")\n", "\n", "# Differences in predicted CRX occupancy of strong enhancers with different CRX-dependences\n", "print(\"Predicted CRX occupancy of strong enhancers with different mutant activities:\")\n", "display(wt_occ_strong_grouper[\"CRX\"].describe())\n", "ustat, pval = stats.mannwhitneyu(*[j for i, j in wt_occ_strong_grouper[\"CRX\"]], alternative=\"two-sided\")\n", "print(f\"Strong enhancers that remain strong vs. do not have the same CRX occupancy, Mann-Whitney U test p={pval:.0e}, U={ustat:.2f}\")\n", "\n", "# 5c: predicted CRX occupancy\n", "ax = ax_list[1, 0]\n", "fig = plot_utils.violin_plot_groupby(wt_occ_strong_grouper[\"CRX\"], \"Predicted CRX occupancy\", class_names=strong_mutant_names, class_colors=color_mapping[[\"Weak enhancer\", \"Strong enhancer\"]], figax=(fig, ax))\n", "ax.set_xlabel(\"CRX-dependence\\nStrong enhancers\")\n", "ax.set_yticks(np.arange(0, 8, 2))\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(strong_mutant_counts, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"c\")\n", "\n", "# Differences in residual IC\n", "print(\"Residual information content of strong enhancers with different mutant activities:\")\n", "mut_entropy_strong_grouper = mut_entropy_measured_df[strong_mask].groupby(stay_strong_mask)\n", 
"display(mut_entropy_strong_grouper[\"entropy\"].describe())\n", "ustat, pval = stats.mannwhitneyu(*[j for i, j in mut_entropy_strong_grouper[\"entropy\"]], alternative=\"two-sided\")\n", "print(f\"Strong enhancers that stay strong vs. do not have the same residual information content, Mann-Whitney U test p={pval:.0e}, U={ustat:.2f}\")\n", "\n", "# 5d: Residual information content\n", "ax = ax_list[1, 1]\n", "fig = plot_utils.violin_plot_groupby(mut_entropy_strong_grouper[\"entropy\"], \"Residual information content\", class_names=strong_mutant_names, class_colors=color_mapping[[\"Weak enhancer\", \"Strong enhancer\"]], figax=(fig, ax))\n", "ax.set_xlabel(\"CRX-dependence\\nStrong enhancers\")\n", "ax.set_yticks(np.arange(0, 11, 2))\n", "ax_twin = ax.twiny()\n", "ax_twin.set_xticks(ax.get_xticks())\n", "ax_twin.set_xlim(ax.get_xlim())\n", "ax_twin.set_xticklabels(strong_mutant_counts, fontsize=10, rotation=45)\n", "plot_utils.add_letter(ax, -0.25, 1.03, \"d\")\n", "\n", "# 5e Visualize the two depresentative sequences\n", "ax = axstrong\n", "become_weak_example_id = \"chr16-43945747-43945911_UPPE\"\n", "become_weak_text = become_weak_example_id.split(\"_\")[0] + \"\\n\" + f\"{wt_entropy_df.loc[become_weak_example_id, 'entropy']:.1f}\" + \" bits, \" + f\"{mut_entropy_df.loc[become_weak_example_id, 'entropy']:.1f}\" + \" residual bits\"\n", "ax, become_weak_visual = visualize_sequence(become_weak_example_id + \"_WT\", ax, \"High CRX-dependence\", become_weak_text, weak_color_rgb)\n", "plot_utils.add_letter(ax, -0.05, 1.03, \"e\")\n", "\n", "ax = axweak\n", "stay_strong_example_id = \"chr11-114685176-114685340_CPPE\"\n", "stay_strong_text = stay_strong_example_id.split(\"_\")[0] + \"\\n\" + f\"{wt_entropy_df.loc[stay_strong_example_id, 'entropy']:.1f}\" + \" bits, \" + f\"{mut_entropy_df.loc[stay_strong_example_id, 'entropy']:.1f}\" + \" residual bits\"\n", "ax, stay_strong_visual = visualize_sequence(stay_strong_example_id + \"_WT\", ax, \"Low CRX-dependence\", 
stay_strong_text, strong_color_rgb)\n", "\n", "fig.tight_layout()\n", "display(fig)\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strong enhancers with low and high CRX dependence have similar wild-type information content and similar total predicted occupancy ([Figure 5b and e](#fig5)). As a result, sequences with more CRX motifs have fewer motifs for other TFs, suggesting that there is no evolutionary pressure for enhancers to contain additional motifs beyond the minimum amount of information content required to be active. To test this idea, we calculated the minimum number and diversity of motifs necessary to specify a relatively unique location in the genome [@bib87] and found that a 164 bp sequence only requires five motifs for three TFs (Materials and methods). These motif requirements can be achieved in two ways with similar information content that differ only in the quantitative number of motifs for each TF. In other words, the number of motifs for any particular TF is not important so long as there is sufficient information content. Taken together, we conclude that each TF motif provides an independent contribution toward specifying strong enhancers.\n", "\n", "# Discussion\n", "\n", "Many regions in the genome are bound by TFs and bear the epigenetic hallmarks of active _cis_-regulatory sequences, yet fail to exhibit _cis_-regulatory activity when tested directly. The discrepancy between measured epigenomic state and _cis_-regulatory activity indicates that enhancers and silencers consist of more than the minimal sequence features necessary to recruit TFs and chromatin-modifying factors. Our results show that enhancers, silencers, and inactive sequences in developing photoreceptors can be distinguished by their motif content, even though they are indistinguishable by CRX binding or chromatin accessibility. 
We show that both enhancers and silencers contain more TF motifs than inactive sequences, and that enhancers also contain more diverse sets of motifs for lineage-defining TFs. These differences are captured by our measure of information content. Information content, as a single metric, identifies strong enhancers nearly as well as an unbiased set of 2080 non-redundant 6-mers used for an SVM, indicating that a simple measure of motif number and diversity can capture the key sequence features that distinguish enhancers from other sequences that lie in open chromatin.\n", "\n", "The results of our information content classifier are consistent with the TF collective model of enhancers [@bib39; @bib78]: globally, active enhancers are specified by the combinatorial action of lineage-defining TFs with little constraint on which motifs must co-occur. We show that CRX-targeted enhancers are distinguished from inactive CRX-targeted sequences by a larger, more diverse collection of TF motifs, and not any specific combination of motifs. This indicates that enhancers are active because they have acquired the necessary number of TF binding motifs, and not because they are defined by a strict regulatory grammar. Sequences with fewer motifs may be bound by CRX and reside within open chromatin, but they lack sufficient TF binding for activity. Such loose constraints would facilitate the de novo emergence of tissue-specific enhancers and silencers over evolution and explain why critical cell type-specific TF interactions, such as CRX and NRL in rod photoreceptors, occur at only a minority of the active enhancers in that cell type [@bib28; @bib32; @bib85].\n", "\n", "Like enhancers, CRX-targeted silencers require higher motif content and are dependent on CRX motifs, but they lack the TF diversity of enhancers. 
The lack of TF diversity in silencers parallels the architecture of signal-responsive _cis_-regulatory sequences, which are silencers in the absence of a signal and require multiple activators for induction [@bib4]. Consistent with this, we previously showed using synthetic sequences that high occupancy of CRX alone is sufficient to encode silencers while the addition of a single NRL motif converts synthetic silencers to enhancers, and that genomic sequences with very high CRX motif content repress a basal promoter that lacks NRL motifs [@bib86]. We found that photoreceptor genes that are de-repressed upon loss of CRX are located near _cis_-regulatory sequences with high CRX motif content, and that genes near regions that are bound only by CRX are expressed at lower levels than genes near regions co-bound by CRX and NRL [@bib86]. In the current study, we find that silencers in our MPRA library are more likely to occur near de-repressed photoreceptor genes, while strong enhancers are enriched near genes that lose expression in _Crx^-/-^_ retina. These findings suggest that the low TF diversity and high CRX motif content that characterize silencers in our MPRA library are also important for silencing in the genome.\n", "\n", "The contrast in motif diversity between enhancers and silencers that we observe could explain how CRX achieves selective activation and repression of its target genes in multiple cell types and across developmental time points [@bib60; @bib72]. CRX itself is required for silencing, and we previously showed that some silencers become active enhancers in _Crx^-/-^_ retina [@bib86]. The mechanism of CRX-based silencing is unknown; however, CRX cooperates with other TFs that can sometimes act as repressors of cell type-specific genes [@bib7; @bib65; @bib84], while other repressors can directly inhibit activation by CRX or its co-activators [@bib12; @bib26; @bib57; @bib75]. 
In _Drosophila_ photoreceptors, selective silencing of opsin genes is controlled by cell type-specific expression of a repressor, Dve, which acts on the same K50 homeodomain-binding sites as a universally expressed activator, Otd, a homolog of CRX [@bib70]. Other transcriptional activators selectively act as repressors in the same cell type. GATA-1 represses the _GATA-2_ promoter by displacing CREB-binding protein (CBP), while at other genes GATA-1 binds CBP to activate transcription [@bib21]. Selective repression by GATA-1 is also mediated by chromatin occupancy levels and interaction with co-regulators [@bib38], which is consistent with our finding that sequence context enables a TF to both activate and repress genes in the same cell type.\n", "\n", "Given the central role of CRX in selectively regulating genes in multiple closely related cell types [@bib60], we speculate that CRX-targeted silencers may contain sufficient information to act as enhancers in other cell types in which a different set of co-activating TFs is expressed. This hypothesis would be consistent with the finding that many silencers are enhancers in other cell types [@bib11; @bib20; @bib61]. 
Our work suggests that characterizing sequences by their motif information content offers a way to identify these different classes of _cis_-regulatory sequences in the genome.\n", "\n", "# Materials and methods\n", "\n", "table: Key resources table\n", ":::\n", "| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |\n", "| ----------------------------------------------------------- | ------------------------------------- | -------------------------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------------------- |\n", "| Strain, strain background (_Mus musculus_, male and female) | CD-1 | Charles River | Strain code 022 | |\n", "| Recombinant DNA reagent | Library1 | This paper | | Listed in [Supplementary file 1](#supp1) |\n", "| Recombinant DNA reagent | Library2 | This paper | | Listed in [Supplementary file 2](#supp2) |\n", "| Recombinant DNA reagent | pJK01_Rhominprox-DsRed | [@bib47] | AddGene plasmid # 173,489 | |\n", "| Recombinant DNA reagent | pJK03\\__Rho_basal\\__DsRed | [@bib47] | AddGene plasmid # 173,490 | |\n", "| Sequence-based reagent | Primers | IDT | | Listed in [Supplementary file 6](#supp6) |\n", "| Commercial assay or kit | Monarch PCR Cleanup Kit | New England Biolabs | T1030S | |\n", "| Commercial assay or kit | Monarch DNA Gel Extraction Kit | New England Biolabs | T1020L | |\n", "| Commercial assay or kit | TURBO DNA-free | Invitrogen | AM1907 | |\n", "| Commercial assay or kit | SuperScript III Reverse Transcriptase | Invitrogen | 18080044 | |\n", "| Software, algorithm | Bedtools | <https://bedtools.readthedocs.io/en/latest/> | RRID:[SCR_006646](https://identifiers.org/RRID/RRID:SCR_006646) | |\n", "| Software, algorithm | MEME Suite | <https://meme-suite.org/> | RRID:[SCR_001783](https://identifiers.org/RRID/RRID:SCR_001783) | |\n", "| Software, algorithm | ShapeMF | 
<https://github.com/h-samee/shape-motif>, [@bib74] | DOI:[10.1016/j.cels.2018.12.001](https://doi.org/10.1016/j.cels.2018.12.001) | |\n", "| Software, algorithm | Numpy | <https://numpy.org/> | DOI:[10.1038/s41586-020-2649-2](https://doi.org/10.1038/s41586-020-2649-2) | |\n", "| Software, algorithm | Scipy | <https://www.scipy.org/> | DOI:[10.1038/s41592-019-0686-2](https://doi.org/10.1038/s41592-019-0686-2) | |\n", "| Software, algorithm | Pandas | <https://pandas.pydata.org/> | DOI:[10.5281/zenodo.3509134](https://doi.org/10.5281/zenodo.3509134) | |\n", "| Software, algorithm | Matplotlib | <https://matplotlib.org/> | DOI:[10.5281/zenodo.1482099](https://doi.org/10.5281/zenodo.1482099) | |\n", "| Software, algorithm | Logomaker | <https://github.com/jbkinney/logomaker>, [@bib40] | DOI:[10.1093/bioinformatics/btz921](https://doi.org/10.1093/bioinformatics/btz921) | |\n", ":::\n", "{#keyresource}\n", "\n", "## Library design\n", "\n", "CRX ChIP-seq peaks re-processed by [@bib72] were intersected with previously published CRX MPRA libraries [@bib32; @bib85] and one unpublished library to select sequences that had not been previously tested by MPRA. These sequences were scanned for instances of CRX motifs using FIMO version 4.11.2 [@bib3], a p-value cutoff of 2.3 × 10^–3^ (see below), and a CRX PWM derived from an electrophoretic mobility shift assay [@bib49]. We centered 2622 sequences on the highest scoring CRX motif. For 677 sequences without a CRX motif, we instead centered them using the Gibbs sampler from ShapeMF (Github commit abe8421) [@bib73] and a motif size of 10.\n", "\n", "For sequences unbound in CRX ChIP-seq but in open chromatin, we took ATAC-seq peaks collected in 8-week FACS-purified rods, green cones, and _Nrl^-/-^_ blue cones [@bib31] and removed sequences that overlapped with CRX ChIP-seq peaks. The remaining sequences were scanned for instances of CRX motifs using FIMO with a p-value cutoff of 2.5 × 10^–3^ and the CRX PWM. 
Sequences with a CRX motif were kept and the three ATAC-seq data sets were merged together, intersected with H3K27ac and H3K4me3 ChIP-seq peaks collected in P14 retinas [@bib72], and centered on the highest scoring CRX motifs. We randomly selected 1004 H3K27ac^+^H3K4me3^-^ sequences and 541 H3K27ac^+^H3K4me3^+^ to reflect the fact that ~35% of CRX ChIP-seq peaks are H3K4me3^+^. After synthesis of our library, we discovered 11% of these sequences do not actually overlap H3K27ac ChIP-seq peaks (110/1004 of the H3K4me3^-^ group and 60/541 of the H3K4me3^+^ group), but we still included them in the analysis because they contain CRX motifs in ATAC-seq peaks.\n", "\n", "All data was converted to mm10 coordinates using the UCSC liftOver tool [@bib22] and processed using Bedtools version 2.27.1 [@bib68]. All sequences in our library design were adjusted to 164 bp and screened for instances of EcoRI, SpeI, SphI, and NotI sites. In total, our library contains 4844 genomic sequences (2622 CRX ChIP-seq peaks with motifs, 677 CRX ChIP-seq peaks without motifs, 1004 CRX^-^ATAC^+^H3K27ac^+^H3K4me3^-^ CRX motifs, and 541 CRX^-^ATAC^+^H3K27ac^+^H3K4me3^+^ CRX motifs), a variant of each sequence with all CRX motifs mutated, 150 scrambled sequences, and a construct for cloning the basal promoter alone.\n", "\n", "For sequences centered on CRX motifs, all CRX motifs with a p-value of 2.5 × 10^–3^ or less were mutated by changing the core TAAT to TACT [@bib49] on the appropriate strand, as described previously [@bib32; @bib85]. We then re-scanned sequences and mutated any additional motifs inadvertently created.\n", "\n", "To generate scrambled sequences, we randomly selected 150 CRX ChIP-seq peaks spanning the entire range of GC content in the library. We then scrambled each sequence while preserving dinucleotide content as previously described [@bib85]. 
We used FIMO to confirm that none of the scrambled sequences contain CRX motifs.\n", "\n", "We unintentionally used a FIMO p-value cutoff of 2.3 × 10^–3^ to identify CRX motifs in CRX ChIP-seq peaks, rather than the slightly less stringent 2.5 × 10^–3^ cutoff used with ATAC-seq peaks or when mutating CRX motifs. Due to this anomaly, there may be sequences centered using ShapeMF that should have been centered on a CRX motif, and these motifs would not have been mutated because CRX motifs were not mutated in sequences centered using ShapeMF. However, any intact CRX motifs would still be captured in the residual information content of the mutant sequence.\n", "\n", "## Plasmid library construction\n", "\n", "We obtained two libraries, each of 15,000 230 bp oligonucleotides (oligos), from Agilent Technologies (Santa Clara, CA) through a limited licensing agreement. Our library was split across the two oligo pools, ensuring that both the genomic and mutant forms of each sequence were placed in the same oligo pool ([Supplementary files 1 and 2](#supp1)). Both oligo pools contain all 150 scrambled sequences as an internal control. All sequences were assigned three unique barcodes as previously described [@bib85]. In each oligo pool, the basal promoter alone was assigned 18 unique barcodes. Oligos were synthesized as follows: 5’ priming sequence (GTAGCGTCTGTCCGT)/EcoRI site/Library sequence/SpeI site/C/SphI site/Barcode sequence/NotI site/3’ priming sequence (CAACTACTACTACAG). To clone the basal promoter into barcoded oligos without any upstream _cis_-regulatory sequence, we placed the SpeI site next to the EcoRI site, which allowed us to place the promoter between the EcoRI site and the 3’ barcode.\n", "\n", "We cloned the synthesized oligos as previously described by our group [@bib47; @bib86; @bib85]. 
Specifically, for each oligo pool, we used 50 femtomoles of template and four cycles of PCR in each of multiple 50 µl reactions (New England Biolabs \\[NEB], Ipswich, MA) (NEB Phusion) using primers MO563 and MO564 ([Supplementary file 6](#supp6)), 2% DMSO, and an annealing temperature of 57°C. PCR amplicons were purified from a 2% agarose gel (NEB), digested with EcoRI-HF and NotI-HF (NEB), and then cloned into the EagI and EcoRI sites of the plasmid pJK03 with multiple 20 µl ligation reactions (NEB T4 ligase). The libraries were transformed into 5-alpha electrocompetent cells (NEB) and grown in liquid culture. Next, 2 µg of each library was digested with SphI-HF and SpeI-HF (NEB) and then treated with Antarctic phosphatase (NEB).\n", "\n", "The _Rho_ basal promoter and _DsRed_ reporter gene were amplified from the plasmid pJK01 using primers MO566 and MO567 ([Supplementary file 6](#supp6)). The Polylinker and _DsRed_ reporter gene were amplified from the plasmid pJK03 using primers MO610 and MO567 ([Supplementary file 6](#supp6)). The Polylinker is a short 23 bp multiple cloning site with no known core promoter motifs. Inserts were purified from a 1% agarose gel (NEB), digested with NheI-HF and SphI-HF (NEB), and cloned into the libraries using multiple 20 µl ligations (NEB T4 ligase). The libraries were transformed into 5-alpha electrocompetent cells (NEB) and grown in liquid culture.\n", "\n", "## Retinal explant electroporation\n", "\n", "Animal procedures were performed in accordance with a Washington University in St Louis Institutional Animal Care and Use Committee-approved vertebrate animals protocol. Electroporation into retinal explants and RNA extraction was performed as described previously [@bib28; @bib32; @bib47; @bib86; @bib85]. Briefly, retinas were isolated from P0 newborn CD-1 mice and electroporated in a solution with 30 µg library and 30 µg _Rho_-GFP. 
Electroporated retinas were cultured for 8 days, at which point they were harvested, washed three times with HBSS (ThermoFisher Scientific/Gibco, Waltham, MA), and stored in TRIzol (ThermoFisher Scientific/Invitrogen, Waltham, MA) at –80°C. Five retinas were pooled for each biological replicate and three replicates were performed for each library. RNA was extracted from TRIzol according to manufacturer’s instructions and treated with TURBO DNase (Invitrogen). cDNA was prepared using SuperScript RT III (Invitrogen) with oligo dT primers. Barcodes from both the cDNA and the plasmid DNA pool were amplified for sequencing (described below). The resulting products were mixed at equal concentration and sequenced on the Illumina NextSeq platform. We obtained greater than 1300× coverage across all samples.\n", "\n", "_Rho_ libraries were amplified using primers MO574 and MO575 ([Supplementary file 6](#supp6)) for six cycles at an annealing temperature of 66°C followed by 18 cycles with no annealing step (NEB Phusion) and then purified with the Monarch PCR kit (NEB). PCR amplicons were digested using MfeI-HF and SphI-HF (NEB) and ligated to custom-annealed adaptors with PE2 indexing barcodes and phased P1 barcodes ([Supplementary file 6](#supp6)). The final enrichment PCR used primers MO588 and MO589 ([Supplementary file 6](#supp6)) for 20 cycles at an annealing temperature of 66°C (NEB Phusion), followed by purification with the Monarch PCR kit. Polylinker libraries were amplified using primers BC_CRX_Nested_F and BC_CRX_R ([Supplementary file 6](#supp6)) for 30 cycles (NEB Q5) at an annealing temperature of 67°C and then purified with the Monarch PCR kit. Illumina adaptors were then added via two further rounds of PCR. 
First, P1 indexing barcodes were added using forward primers P1_inner_A through P1_inner_D and reverse primer P1_inner_nested_rev ([Supplementary file 6](#supp6)) for five cycles at an annealing temperature of 55°C followed by five cycles with no annealing step (NEB Q5). PE2 indexing barcodes were then added by amplifying 2 µl of the previous reaction with forward primer P1_outer and reverse primers PE2_outer_SIC69 and PE2_outer_SIC70 ([Supplementary file 6](#supp6)) for five cycles at an annealing temperature of 66°C followed by five cycles with no annealing step (NEB Q5) and then purified with the Monarch PCR kit.\n", "\n", "## Data processing\n", "\n", "All data processing, statistical analysis, and downstream analyses were performed in Python version 3.6.5 using Numpy version 1.15.4 [@bib24], Scipy version 1.1.0 [@bib82], and Pandas version 0.23.4 [@bib54], and visualized using Matplotlib version 3.0.2 [@bib33] and Logomaker version 0.8 [@bib81]. All statistical analysis used two-sided tests unless noted otherwise.\n", "\n", "Sequencing reads were filtered to ensure that the barcode sequence perfectly matched the expected sequence (>93% reads in a sample for the _Rho_ libraries, >86% reads for the Polylinker libraries). For the _Rho_ libraries, barcodes that had less than 10 raw counts in the DNA sample were considered missing and removed from downstream analysis. Barcodes that had less than five raw counts in any cDNA sample were considered present in the input plasmid pool but below the detection limit and thus set to zero in all samples. Barcode counts were normalized by reads per million (RPM) for each sample. Barcode expression was calculated by dividing the cDNA RPM by the DNA RPM. Replicate-specific expression was calculated by averaging the barcodes corresponding to each library sequence. 
After performing statistical analysis (see below), expression levels were normalized by replicate-specific basal mean expression and then averaged across biological replicates.\n", "\n", "For the Polylinker assay, the expected lack of expression of many constructs required different processing. Barcodes that had less than 50 raw counts in the DNA sample were removed from downstream analysis. Barcodes were normalized by RPM for each replicate. Barcodes that had less than 8 RPM in any cDNA sample were set to zero in all samples. cDNA RPM were then divided by DNA RPM as above. Within each biological replicate, barcodes were averaged as above but were not normalized to basal expression because there is no basal construct. Expression values were then averaged across biological replicates. Due to the low expression of scrambled sequences and the lack of a basal construct, we were unable to assess data calibration with the same rigor as above.\n", "\n", "## Assignment of activity classes\n", "\n", "Activity classes were assigned by comparing expression levels to basal promoter expression levels across replicates. The null hypothesis is that the expression of a sequence is the same as basal levels. Expression levels were approximately log-normally distributed, so we computed the log-normal parameters for each sequence and then performed Welch’s t-test. We corrected for multiple hypotheses using the Benjamini-Hochberg FDR procedure. We corrected for multiple hypotheses in each library separately to account for any potential batch effects between libraries. The log~2~ expression was calculated after adding a pseudocount of 1 × 10^–3^ to every sequence.\n", "\n", "Sequences were classified as enhancers if they were twofold above basal and the q-value was below 0.05. Silencers were similarly defined as twofold below basal and q-value less than 0.05. Inactive sequences were defined as within a twofold change and q-value greater than or equal to 0.05. 
All other sequences were classified as ambiguous and removed from further analysis. We used scrambled sequences to further stratify enhancers into strong and weak enhancers, using the rationale that scrambled sequences give an empirical distribution for the activity of random sequences. We defined strong enhancers as enhancers that are above the 95th percentile of scrambled sequences and all other enhancers as weak enhancers.\n", "\n", "For the Polylinker assay, we did not have a basal construct as a reference point. Instead, we defined a sequence to have autonomous activity if the average cDNA barcode counts were higher than average DNA barcode counts, and non-autonomous otherwise. The log~2~ expression was calculated after adding a pseudocount of 1 × 10^–2^ to every sequence.\n", "\n", "## RNA-seq analysis\n", "\n", "We obtained RNA-seq data from WT and Crx^-/-^ P21 retinas [@bib71] processed into a counts matrix for each gene by [@bib72]. Each sample was normalized by read counts per million and replicates were averaged together. Genes with at least a twofold change between genotypes were considered differentially expressed. We determined which differentially expressed genes are near a member of our library using previously published associations between retinal ATAC-seq peaks and genes [@bib60]. For de-repressed genes, we determined how often the nearest library member is a silencer; for down-regulated genes, we determined how often the nearest library member is a strong or weak enhancer.\n", "\n", "## Motif analysis\n", "\n", "We performed motif enrichment analysis using the MEME Suite version 5.0.4 [@bib3]. We searched for motifs that were enriched in one group of sequences relative to another group using DREME-py3 with the parameters -mink 6 -maxk 12 -e 0.05 and compared the de novo motifs to known motifs using TOMTOM on default parameters. We ran DREME using strong enhancers as positives and silencers as negatives, and vice versa. 
For TOMTOM, we used version 11 of the full mouse HOCOMOCO database [@bib46] with the following additions from the JASPAR human database [@bib42]: NRL (accession MA0842.1), RORB (accession MA1150.1), and RAX (accession MA0718.1). We added these PWMs because they have known roles in the retina, but the mouse PWMs were not in the HOCOMOCO database. We also used the CRX PWM that we used to design the library. Motifs were selected for downstream analysis based on their matches to the de novo motifs, whether the TF had a known role in retinal development, and the quality of the PWM. Because PWMs from TFs of the same family were so similar, we used one TF for each DREME motif, recognizing that these motifs may be bound by other TFs that recognize similar motifs. We did not use any PWMs with a quality of ‘D’. We excluded DREME motifs without a match to the database from further analysis; most of these resemble dinucleotides.\n", "\n", "## Predicted occupancy\n", "\n", "We computed predicted occupancy as previously described [@bib85; @bib89]. Briefly, we normalized each letter probability matrix by the most probable letter at each position. We took the negative log of this matrix and multiplied by 2.5, which corresponds to the ideal gas constant times 300 K, to obtain an energy weight matrix. We used a chemical potential _μ_ of 9 for all TFs. At this value, the probability of a site being bound is at least 0.5 if the relative _K_~D~ is at least 0.03 of the optimal binding site. We computed the predicted occupancy for every site on the forward and reverse strands and summed them together to get a single value for each TF.\n", "\n", "To determine if there is a bias in the linear arrangement of motifs, we selected strong enhancers with exactly one site occupied by CRX and exactly one site occupied by a second TF. We counted the number of times the position of the second TF was 5’ and 3’ of the CRX site and then performed a binomial test. 
We did not observe a statistically significant bias for any TF at an FDR q-value cutoff of 0.05. We also performed this analysis for silencers with exactly one site occupied by CRX and exactly one site occupied by NRL and did not observe a significant difference in the 5’ vs. 3’ bias of strong enhancers vs. silencers (Fisher’s exact test p = 0.17).\n", "\n", "## Information content\n", "\n", "To capture the effects of TF predicted occupancy and diversity in a single metric, we calculated the motif information content using Boltzmann entropy. Boltzmann’s equation states that the entropy of a system ${\\displaystyle S}$ is related to the number of ways the molecules can be arranged (microstates) ${\\displaystyle W}$ via the equation ${\\displaystyle S={k}_{B}\\mathrm{log}W}$, where ${\\displaystyle {k}_{B}}$ is Boltzmann’s constant ([@bib67], Chapter 5). The number of microstates is defined as ${\\displaystyle W=\\frac{N!}{\\prod _{i}{N}_{i}!}}$ where ${\\displaystyle N}$ is the total number of particles and ${\\displaystyle {N}_{i}}$ is the number of particles of the ${\\displaystyle i}$-th type. In our case, the system is the collection of predicted binding motifs for different TFs in a _cis_-regulatory sequence. We assume each TF is a different type of molecule because the DNA-binding domain of each TF belongs to a different subfamily. The number of molecular arrangements ${\\displaystyle W}$ represents the number of distinguishable ways that the TFs can be ordered on the sequence. Thus, ${\\displaystyle {N}_{i}}$ is the predicted occupancy of the ${\\displaystyle i}$-th TF and ${\\displaystyle N}$ is the total predicted occupancy of all TFs on the _cis_-regulatory sequence. 
Because the predicted occupancies are continuous values, we use the Gamma function, which satisfies ${\\displaystyle \\mathrm{\\Gamma }(N+1)=N!}$ for integer ${\\displaystyle N}$ and extends the factorial to non-integer values, to rewrite ${\\displaystyle W=\\frac{\\mathrm{\\Gamma }(N+1)}{\\prod _{i}\\mathrm{\\Gamma }({N}_{i}+1)}}$.\n", "\n", "If we assume that each arrangement of motifs is equally likely, then we can write the probability of arrangement ${\\displaystyle w=1,\\dots ,W}$ as ${\\displaystyle {p}_{w}=\\frac{1}{W}}$ and rewrite the entropy as ${\\displaystyle S=-\\mathrm{log}(\\frac{1}{W})=-\\mathrm{log}({p}_{w})}$, where we have dropped Boltzmann’s constant since the connection between molecular arrangements and temperature is not important. Because each arrangement is equally likely, ${\\displaystyle \\frac{1}{W}}$ is also the expected value of ${\\displaystyle {p}_{w}}$ and we can write the entropy as ${\\displaystyle S=-E[\\mathrm{log}({p}_{w})]=-\\sum _{w}{p}_{w}\\mathrm{log}({p}_{w})}$, which is Shannon entropy. By definition, Shannon entropy is also the expected value of the information content: ${\\displaystyle E[I]=-\\sum _{w}{p}_{w}\\mathrm{log}({p}_{w})=\\sum _{w}{p}_{w}I(w)}$, where the information content ${\\displaystyle I}$ of a particular state is ${\\displaystyle I(w)=-\\mathrm{log}({p}_{w})}$. Since we assumed each arrangement is equally likely, the expected value of the information content is also the information content of each arrangement. Therefore, the information content of a _cis_-regulatory sequence can be written as ${\\displaystyle I=-\\mathrm{log}({p}_{w})=\\mathrm{log}W}$. We use log base 2 to express the information content in bits.\n", "\n", "With this metric, _cis_-regulatory sequences with higher predicted TF occupancies generally have higher information content. Sequences with higher TF diversity have higher information content than lower diversity sequences with the same predicted occupancy. 
Thus, our metric captures the effects of both TF diversity and total TF occupancy. For example, consider hypothetical TFs A, B, and C. If motifs for only one TF are in a sequence, then ${\\displaystyle W}$ is always one and the information content is always zero (regardless of total occupancy). The simplest case for non-zero information content is one motif for A, one motif for B, and zero motifs for C (1-1-0). Then ${\\displaystyle W=\\frac{2!}{1!1!}=2}$ and ${\\displaystyle I=1}$ bit. If we increase predicted occupancy by adding a motif for A (2-1-0), then ${\\displaystyle W=\\frac{3!}{2!1!}=3}$ and ${\\displaystyle I=1.6}$ bits, which is approximately the information content of silencers and inactive sequences. If we increase predicted occupancy again and add a second motif for B (2-2-0), then ${\\displaystyle W=\\frac{4!}{2!2!}=6}$ and ${\\displaystyle I=2.6}$ bits, which is approximately the information content of strong enhancers. If, instead of increasing predicted occupancy, we increase diversity by replacing a motif for A with a motif for C (1-1-1), then ${\\displaystyle W=\\frac{3!}{1!1!1!}=6}$ and once again ${\\displaystyle I=2.6}$ bits, which is higher than the lower diversity case (2-1-0).\n", "\n", "According to [@bib87], the probability of observing ${\\displaystyle k}$ total motifs for ${\\displaystyle m}$ different TFs in a ${\\displaystyle w}$ bp window is ${\\displaystyle p(k)\\sim \\mathrm{Poisson}(k;\\lambda )}$, where ${\\displaystyle \\lambda =pmw}$ and ${\\displaystyle p}$ is the probability of finding a spurious motif in the genome. The expected number of windows with ${\\displaystyle k}$ total motifs in a genome of length ${\\displaystyle N}$ is thus ${\\displaystyle E(k)=p(k)\\cdot N}$. 
In mammals, ${\\displaystyle N\\approx {10}^{9}}$ and Wunderlich and Mirny find that ${\\displaystyle p=0.0025}$ for multicellular eukaryotes. For ${\\displaystyle m=3}$ TFs and a ${\\displaystyle w=164}$ bp window (which is the size of our sequences), ${\\displaystyle \\lambda =0.123}$ and ${\\displaystyle E(5)=1.6}$, meaning that five total motifs for three different TFs specify an approximately unique 164 bp location in a mammalian genome. Five total motifs for three different TFs can be achieved in two ways: three motifs for A, one for B, and one for C (3-1-1), or two motifs for A, two for B, and one for C (2-2-1). In the case of 3-1-1, ${\\displaystyle W=\\frac{5!}{3!1!1!}=20}$ and ${\\displaystyle I=4.3}$ bits. In the case of 2-2-1, ${\\displaystyle W=\\frac{5!}{2!2!1!}=30}$ and ${\\displaystyle I=4.9}$ bits.\n", "\n", "## Machine learning\n", "\n", "The _k_-mer SVM was fit using gkmSVM [@bib19]. All other machine learning, including cross-validation, logistic regression, and computing ROC and PR curves, was performed using scikit-learn version 0.19.1 [@bib64]. We wrote custom Python wrappers for gkmSVM to allow for interfacing between the C++ binaries and the rest of our workflow. We ran gkmSVM with the parameters -l 6 -k 6 -m 1. To estimate model performance, all models were fit with stratified fivefold cross-validation after shuffling the order of sequences. For the TF occupancy logistic regression model, we used L2 regularization. We selected the regularization parameter C by performing grid search with fivefold cross-validation on the values 10^–4^, 10^–3^, 10^–2^, 10^–1^, 1, 10^1^, 10^2^, 10^3^, 10^4^ and selecting the value that maximized the F1 score. 
The optimal value of C was 0.01, which we used as the regularization strength when assessing the performance of the model with other feature sets.\n", "\n", "To assess the performance of the logistic regression model, we randomly sampled eight PWMs from the HOCOMOCO database and computed the predicted occupancy of each TF on each sequence. We then fit a new logistic regression model with these features and repeated this procedure 100 times to generate a background distribution of model performances.\n", "\n", "To generate de novo motifs from the SVM, we generated all 6-mers and scored them against the SVM. We then ran the svmw_emalign.py script from gkmSVM on the _k_-mer scores with the parameters -n 10 -f 2 -m 4 and a PWM length of 6, and then used TOMTOM to compare them to the database from our motif analysis.\n", "\n", "## Other data sources\n", "\n", "We used our previously published library [@bib85] as an independent test set for our machine learning models. We defined strong enhancers as ChIP-seq peaks that were above the 95th percentile of all scrambled sequences. There was no basal promoter construct in this library, so instead we defined silencers as ChIP-seq peaks that were at least twofold below the log~2~ mean of all scrambled sequences.\n", "\n", "Previously published ChIP-seq data for NRL [@bib23] that was re-processed by [@bib31] and MEF2D [@bib2] was used to annotate sequences for in vivo TF binding. We converted peaks to mm10 coordinates using the UCSC liftOver tool and then used Bedtools to intersect peaks with our library." ] } ], "metadata": { "about": [ { "name": "Computational and Systems Biology", "type": "DefinedTerm" }, { "name": "Genetics and Genomics", "type": "DefinedTerm" } ], "authors": [ { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St. 
Louis", "type": "PostalAddress" }, "name": "Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine", "type": "Organization" }, { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Department of Genetics, Washington University School of Medicine", "type": "Organization" } ], "familyNames": [ "Friedman" ], "givenNames": [ "Ryan", "Z" ], "name": "Ryan Z Friedman", "type": "Person" }, { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine", "type": "Organization" }, { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Department of Genetics, Washington University School of Medicine", "type": "Organization" } ], "familyNames": [ "Granas" ], "givenNames": [ "David", "M" ], "name": "David M Granas", "type": "Person" }, { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St Louis", "type": "PostalAddress" }, "name": "Department of Pathology and Immunology, Washington University School of Medicine", "type": "Organization" } ], "familyNames": [ "Myers" ], "givenNames": [ "Connie", "A" ], "name": "Connie A Myers", "type": "Person" }, { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St Louis", "type": "PostalAddress" }, "name": "Department of Pathology and Immunology, Washington University School of Medicine", "type": "Organization" } ], "familyNames": [ "Corbo" ], "givenNames": [ "Joseph", "C" ], "name": "Joseph C Corbo", "type": "Person" }, { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St. 
Louis", "type": "PostalAddress" }, "name": "Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine", "type": "Organization" }, { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Department of Genetics, Washington University School of Medicine", "type": "Organization" } ], "familyNames": [ "Cohen" ], "givenNames": [ "Barak", "A" ], "name": "Barak A Cohen", "type": "Person" }, { "affiliations": [ { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine", "type": "Organization" }, { "address": { "addressCountry": "United States", "addressLocality": "St. Louis", "type": "PostalAddress" }, "name": "Department of Genetics, Washington University School of Medicine", "type": "Organization" } ], "emails": [ "mawhite@wustl.edu" ], "familyNames": [ "White" ], "givenNames": [ "Michael", "A" ], "name": "Michael A White", "type": "Person" } ], "dateAccepted": { "type": "Date", "value": "2021-09-03" }, "datePublished": { "type": "Date", "value": "2021-09-06" }, "dateReceived": { "type": "Date", "value": "2021-02-09" }, "description": [ "Enhancers and silencers often depend on the same transcription factors (TFs) and are conflated in genomic assays of TF binding or chromatin state. To identify sequence features that distinguish enhancers and silencers, we assayed massively parallel reporter libraries of genomic sequences targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both enhancers and silencers contain more TF motifs than inactive sequences, but relative to silencers, enhancers contain motifs from a more diverse collection of TFs. 
We developed a measure of information content that describes the number and diversity of motifs in a sequence and found that, while both enhancers and silencers depend on CRX motifs, enhancers have higher information content. The ability of information content to distinguish enhancers and silencers targeted by the same TF illustrates how motif context determines the activity of ", { "content": [ "cis" ], "type": "Emphasis" }, "-regulatory sequences." ], "editors": [ { "affiliations": [ { "address": { "addressCountry": "Israel", "type": "PostalAddress" }, "name": "Weizmann Institute of Science", "type": "Organization" } ], "familyNames": [ "Barkai" ], "givenNames": [ "Naama" ], "type": "Person" } ], "fundedBy": [ { "funders": [ { "name": "National Institutes of Health", "type": "Organization" } ], "identifiers": [ { "type": "PropertyValue", "value": "F31HG011431" } ], "type": "MonetaryGrant" }, { "funders": [ { "name": "National Institutes of Health", "type": "Organization" } ], "identifiers": [ { "type": "PropertyValue", "value": "R01GM121755" } ], "type": "MonetaryGrant" }, { "funders": [ { "name": "National Institutes of Health", "type": "Organization" } ], "identifiers": [ { "type": "PropertyValue", "value": "R01EY027784" } ], "type": "MonetaryGrant" }, { "funders": [ { "name": "National Institutes of Health", "type": "Organization" } ], "identifiers": [ { "type": "PropertyValue", "value": "R01EY025196" } ], "type": "MonetaryGrant" }, { "funders": [ { "name": "National Institutes of Health", "type": "Organization" } ], "identifiers": [ { "type": "PropertyValue", "value": "R01EY030075" } ], "type": "MonetaryGrant" } ], "genre": [ "Research Article" ], "identifiers": [ { "name": "publisher-id", "propertyID": "https://registry.identifiers.org/registry/publisher-id", "type": "PropertyValue", "value": 67403 }, { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.7554/eLife.67403" }, { "name": 
"elocation-id", "propertyID": "https://registry.identifiers.org/registry/elocation-id", "type": "PropertyValue", "value": "e67403" } ], "isPartOf": { "isPartOf": { "identifiers": [ { "name": "nlm-ta", "propertyID": "https://registry.identifiers.org/registry/nlm-ta", "type": "PropertyValue", "value": "elife" }, { "name": "publisher-id", "propertyID": "https://registry.identifiers.org/registry/publisher-id", "type": "PropertyValue", "value": "eLife" } ], "issns": [ "2050-084X" ], "publisher": { "name": "eLife Sciences Publications, Ltd", "type": "Organization" }, "title": "eLife", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 10 }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "keywords": [ "enhancers", "silencers", "information theory", "massively parallel reporter assays", "Mouse" ], "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "licenses": [ { "content": [ { "content": [ "This article is distributed under the terms of the ", { "content": [ "Creative Commons Attribution License" ], "target": "http://creativecommons.org/licenses/by/4.0/", "type": "Link" }, ", which permits unrestricted use and redistribution provided that the original author and source are credited." 
], "type": "Paragraph" } ], "type": "CreativeWork", "url": "http://creativecommons.org/licenses/by/4.0/" } ], "references": [ { "authors": [ { "familyNames": [ "Alexandre" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Vincent" ], "givenNames": [ "JP" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2003" }, "id": "bib1", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1242/dev.00286" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 12506003 } ], "isPartOf": { "isPartOf": { "name": "Development", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 130 }, "pageEnd": 739, "pageStart": 729, "title": "Requirements for transcriptional repression and activation by engrailed in Drosophila embryos", "type": "Article" }, { "authors": [ { "familyNames": [ "Andzelm" ], "givenNames": [ "MM" ], "type": "Person" }, { "familyNames": [ "Cherry" ], "givenNames": [ "TJ" ], "type": "Person" }, { "familyNames": [ "Harmin" ], "givenNames": [ "DA" ], "type": "Person" }, { "familyNames": [ "Boeke" ], "givenNames": [ "AC" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Hemberg" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Pawlyk" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Malik" ], "givenNames": [ "AN" ], "type": "Person" }, { "familyNames": [ "Flavell" ], "givenNames": [ "SW" ], "type": "Person" }, { "familyNames": [ "Sandberg" ], "givenNames": [ "MA" ], "type": "Person" }, { "familyNames": [ "Raviola" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Greenberg" ], "givenNames": [ "ME" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib2", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.neuron.2015.02.038" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25801704 } ], "isPartOf": { "isPartOf": { "name": "Neuron", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 86 }, "pageEnd": 263, "pageStart": 247, "title": "MEF2D drives photoreceptor development through a genome-wide competition for tissue-specific enhancers", "type": "Article" }, { "authors": [ { "familyNames": [ "Bailey" ], "givenNames": [ "TL" ], "type": "Person" }, { "familyNames": [ "Boden" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Buske" ], "givenNames": [ "FA" ], "type": "Person" }, { "familyNames": [ "Frith" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Grant" ], "givenNames": [ "CE" ], "type": "Person" }, { "familyNames": [ "Clementi" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Ren" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Li" ], "givenNames": [ "WW" ], "type": "Person" }, { "familyNames": [ "Noble" ], "givenNames": [ "WS" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2009" }, "id": "bib3", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/nar/gkp335" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 19458158 } ], "isPartOf": { "isPartOf": { "name": "Nucleic Acids Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 37 }, "title": "MEME SUITE: Tools for motif discovery and searching", "type": "Article" }, { "authors": [ { "familyNames": [ "Barolo" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Posakony" ], "givenNames": [ "JW" ], "type": "Person" } ], "datePublished": { "type": "Date", 
"value": "2002" }, "id": "bib4", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gad.976502" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 12023297 } ], "isPartOf": { "isPartOf": { "name": "Genes & Development", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 16 }, "pageEnd": 1181, "pageStart": 1167, "title": "Three habits of highly effective signaling pathways: Principles of transcriptional control by developmental cell signaling", "type": "Article" }, { "authors": [ { "familyNames": [ "Brand" ], "givenNames": [ "AH" ], "type": "Person" }, { "familyNames": [ "Micklem" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Nasmyth" ], "givenNames": [ "K" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1987" }, "id": "bib5", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/0092-8674(87)90094-8" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 3315230 } ], "isPartOf": { "isPartOf": { "name": "Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 51 }, "pageEnd": 719, "pageStart": 709, "title": "A yeast silencer contains sequences that can promote autonomous plasmid replication and transcriptional activation", "type": "Article" }, { "authors": [ { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "QL" ], "type": "Person" }, { "familyNames": [ "Nie" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Sun" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Lennon" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Copeland" ], "givenNames": [ "NG" ], "type": "Person" 
}, { "familyNames": [ "Gilbert" ], "givenNames": [ "DJ" ], "type": "Person" }, { "familyNames": [ "Jenkins" ], "givenNames": [ "NA" ], "type": "Person" }, { "familyNames": [ "Zack" ], "givenNames": [ "DJ" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1997" }, "id": "bib6", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/s0896-6273(00)80394-3" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 9390516 } ], "isPartOf": { "isPartOf": { "name": "Neuron", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 19 }, "pageEnd": 1030, "pageStart": 1017, "title": "Crx, a novel Otx-like paired-homeodomain protein, binds to and transactivates photoreceptor cell-specific genes", "type": "Article" }, { "authors": [ { "familyNames": [ "Chen" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Rattner" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Nathans" ], "givenNames": [ "J" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2005" }, "id": "bib7", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1523/JNEUROSCI.3571-04.2005" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 15634773 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Neuroscience", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 25 }, "pageEnd": 129, "pageStart": 118, "title": "The rod photoreceptor-specific nuclear receptor Nr2e3 represses transcription of multiple cone-specific genes", "type": "Article" }, { "authors": [ { "familyNames": [ "Chiang" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Ayyanathan" ], "givenNames": [ "K" ], "type": "Person" } ], 
"datePublished": { "type": "Date", "value": "2013" }, "id": "bib8", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.cytogfr.2012.09.002" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 23102646 } ], "isPartOf": { "isPartOf": { "name": "Cytokine & Growth Factor Reviews", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 24 }, "pageEnd": 131, "pageStart": 123, "title": "SNAIL/GFI-1 (SNAG) family zinc finger proteins in transcription regulation, chromatin dynamics, cell signaling, development, and disease", "type": "Article" }, { "authors": [ { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Lawrence" ], "givenNames": [ "KA" ], "type": "Person" }, { "familyNames": [ "Karlstetter" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Abdelaziz" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Dirkes" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Weigelt" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Seifert" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Benes" ], "givenNames": [ "V" ], "type": "Person" }, { "familyNames": [ "Fritsche" ], "givenNames": [ "LG" ], "type": "Person" }, { "familyNames": [ "Weber" ], "givenNames": [ "BHF" ], "type": "Person" }, { "familyNames": [ "Langmann" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2010" }, "id": "bib9", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.109405.110" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 20693478 } ], 
"isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 20 }, "pageEnd": 1525, "pageStart": 1512, "title": "CRX ChIP-seq reveals the cis-regulatory architecture of mouse photoreceptors", "type": "Article" }, { "authors": [ { "familyNames": [ "Crocker" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Abe" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Rinaldi" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "McGregor" ], "givenNames": [ "AP" ], "type": "Person" }, { "familyNames": [ "Frankel" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Alsawadi" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Valenti" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Plaza" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Payre" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Mann" ], "givenNames": [ "RS" ], "type": "Person" }, { "familyNames": [ "Stern" ], "givenNames": [ "DL" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib10", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.cell.2014.11.041" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25557079 } ], "isPartOf": { "isPartOf": { "name": "Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 160 }, "pageEnd": 203, "pageStart": 191, "title": "Low affinity binding site clusters confer hox specificity and regulatory robustness", "type": "Article" }, { "authors": [ { "familyNames": [ "Doni", "Jayavelu" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Jajodia" ], "givenNames": [ "A" ], "type": "Person" }, { 
"familyNames": [ "Mishra" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Hawkins" ], "givenNames": [ "RD" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib11", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41467-020-14853-5" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32103011 } ], "isPartOf": { "isPartOf": { "name": "Nature Communications", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 11 }, "title": "Candidate silencer elements for the human and mouse genomes", "type": "Article" }, { "authors": [ { "familyNames": [ "Dorval" ], "givenNames": [ "KM" ], "type": "Person" }, { "familyNames": [ "Bobechko" ], "givenNames": [ "BP" ], "type": "Person" }, { "familyNames": [ "Fujieda" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Zack" ], "givenNames": [ "DJ" ], "type": "Person" }, { "familyNames": [ "Bremner" ], "givenNames": [ "R" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2006" }, "id": "bib12", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1074/jbc.M509470200" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 16236706 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Biological Chemistry", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 281 }, "pageEnd": 751, "pageStart": 744, "title": "Chx10 targets a subset of photoreceptor genes", "type": "Article" }, { "authors": [ { "familyNames": [ "Ernst" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Kellis" ], "givenNames": [ "M" ], "type": "Person" } ], 
"datePublished": { "type": "Date", "value": "2012" }, "id": "bib13", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nmeth.1906" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 22373907 } ], "isPartOf": { "isPartOf": { "name": "Nature Methods", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 9 }, "pageEnd": 216, "pageStart": 215, "title": "ChromHMM: automating chromatin-state discovery and characterization", "type": "Article" }, { "authors": [ { "familyNames": [ "Fan" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Toubal" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Goñi" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Drareni" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Huang" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Alzaid" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Ballaire" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Ancel" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Liang" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Damdimopoulos" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Hainault" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Soprani" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Aron-Wisnewsky" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Foufelle" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Lawrence" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Gautier" ], "givenNames": [ "JF" ], "type": "Person" }, { "familyNames": [ "Venteclef" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Treuter" ], "givenNames": [ "E" ], "type": "Person" } ], 
"datePublished": { "type": "Date", "value": "2016" }, "id": "bib14", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nm.4114" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 27270589 } ], "isPartOf": { "isPartOf": { "name": "Nature Medicine", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 22 }, "pageEnd": 791, "pageStart": 780, "title": "Loss of the co-repressor GPS2 sensitizes macrophage activation upon metabolic stress induced by obesity and type 2 diabetes", "type": "Article" }, { "authors": [ { "familyNames": [ "Farley" ], "givenNames": [ "EK" ], "type": "Person" }, { "familyNames": [ "Olson" ], "givenNames": [ "KM" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Brandt" ], "givenNames": [ "AJ" ], "type": "Person" }, { "familyNames": [ "Rokhsar" ], "givenNames": [ "DS" ], "type": "Person" }, { "familyNames": [ "Levine" ], "givenNames": [ "MS" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib15", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1126/science.aac6948" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 26472909 } ], "isPartOf": { "isPartOf": { "name": "Science", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 350 }, "pageEnd": 328, "pageStart": 325, "title": "Suboptimization of developmental enhancers", "type": "Article" }, { "authors": [ { "familyNames": [ "Farley" ], "givenNames": [ "EK" ], "type": "Person" }, { "familyNames": [ "Olson" ], "givenNames": [ "KM" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Rokhsar" ], 
"givenNames": [ "DS" ], "type": "Person" }, { "familyNames": [ "Levine" ], "givenNames": [ "MS" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2016" }, "id": "bib16", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.1605085113" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 27155014 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 113 }, "pageEnd": 6513, "pageStart": 6508, "title": "Syntax compensates for poor binding sites to encode tissue specificity of developmental enhancers", "type": "Article" }, { "authors": [ { "familyNames": [ "Freund" ], "givenNames": [ "CL" ], "type": "Person" }, { "familyNames": [ "Gregory-Evans" ], "givenNames": [ "CY" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Papaioannou" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Looser" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Ploder" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Bellingham" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Ng" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Herbrick" ], "givenNames": [ "JAS" ], "type": "Person" }, { "familyNames": [ "Duncan" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Scherer" ], "givenNames": [ "SW" ], "type": "Person" }, { "familyNames": [ "Tsui" ], "givenNames": [ "LC" ], "type": "Person" }, { "familyNames": [ "Loutradis-Anagnostou" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Jacobson" ], "givenNames": [ "SG" ], "type": "Person" }, { "familyNames": [ "Cepko" ], "givenNames": [ "CL" ], "type": "Person" }, { "familyNames": [ "Bhattacharya" ], "givenNames": [ "SS" ], "type": 
"Person" }, { "familyNames": [ "McInnes" ], "givenNames": [ "RR" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1997" }, "id": "bib17", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/s0092-8674(00)80440-7" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 9390563 } ], "isPartOf": { "isPartOf": { "name": "Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 91 }, "pageEnd": 553, "pageStart": 543, "title": "Cone-rod dystrophy due to mutations in a novel photoreceptor-specific homeobox gene (CRX) essential for maintenance of the photoreceptor", "type": "Article" }, { "authors": [ { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Morrow" ], "givenNames": [ "EM" ], "type": "Person" }, { "familyNames": [ "Cepko" ], "givenNames": [ "CL" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1997" }, "id": "bib18", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/s0092-8674(00)80439-0" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 9390562 } ], "isPartOf": { "isPartOf": { "name": "Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 91 }, "pageEnd": 541, "pageStart": 531, "title": "Crx, a novel otx-like homeobox gene, shows photoreceptor-specific expression and regulates photoreceptor differentiation", "type": "Article" }, { "authors": [ { "familyNames": [ "Ghandi" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Mohammad-Noori" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Beer" ], "givenNames": [ "MA" ], "type": 
"Person" } ], "datePublished": { "type": "Date", "value": "2014" }, "id": "bib19", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1371/journal.pcbi.1003711" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25033408 } ], "isPartOf": { "isPartOf": { "name": "PLOS Computational Biology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 10 }, "title": "Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features", "type": "Article" }, { "authors": [ { "familyNames": [ "Gisselbrecht" ], "givenNames": [ "SS" ], "type": "Person" }, { "familyNames": [ "Palagi" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Kurland" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Rogers" ], "givenNames": [ "JM" ], "type": "Person" }, { "familyNames": [ "Ozadam" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Zhan" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Dekker" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Bulyk" ], "givenNames": [ "ML" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib20", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.molcel.2019.10.004" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 31704182 } ], "isPartOf": { "isPartOf": { "name": "Molecular Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 77 }, "pageEnd": 337, "pageStart": 324, "title": "Transcriptional silencers in Drosophila serve a dual role as transcriptional enhancers in alternate cellular contexts", "type": "Article" }, { "authors": [ { "familyNames": [ "Grass" ], "givenNames": [ "JA" ], "type": "Person" 
}, { "familyNames": [ "Boyer" ], "givenNames": [ "ME" ], "type": "Person" }, { "familyNames": [ "Pal" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Wu" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Weiss" ], "givenNames": [ "MJ" ], "type": "Person" }, { "familyNames": [ "Bresnick" ], "givenNames": [ "EH" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2003" }, "id": "bib21", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.1432147100" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 12857954 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 100 }, "pageEnd": 8816, "pageStart": 8811, "title": "GATA-1-dependent transcriptional repression of GATA-2 via disruption of positive autoregulation and domain-wide chromatin remodeling", "type": "Article" }, { "authors": [ { "familyNames": [ "Haeussler" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Zweig" ], "givenNames": [ "AS" ], "type": "Person" }, { "familyNames": [ "Tyner" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Speir" ], "givenNames": [ "ML" ], "type": "Person" }, { "familyNames": [ "Rosenbloom" ], "givenNames": [ "KR" ], "type": "Person" }, { "familyNames": [ "Raney" ], "givenNames": [ "BJ" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "CM" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "BT" ], "type": "Person" }, { "familyNames": [ "Hinrichs" ], "givenNames": [ "AS" ], "type": "Person" }, { "familyNames": [ "Gonzalez" ], "givenNames": [ "JN" ], "type": "Person" }, { "familyNames": [ "Gibson" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Diekhans" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Clawson" ], 
"givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Casper" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Barber" ], "givenNames": [ "GP" ], "type": "Person" }, { "familyNames": [ "Haussler" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Kuhn" ], "givenNames": [ "RM" ], "type": "Person" }, { "familyNames": [ "Kent" ], "givenNames": [ "WJ" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2019" }, "id": "bib22", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/nar/gky1095" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30407534 } ], "isPartOf": { "isPartOf": { "name": "Nucleic Acids Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 47 }, "title": "The UCSC Genome Browser database: 2019 update", "type": "Article" }, { "authors": [ { "familyNames": [ "Hao" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Kim" ], "givenNames": [ "DS" ], "type": "Person" }, { "familyNames": [ "Klocke" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Johnson" ], "givenNames": [ "KR" ], "type": "Person" }, { "familyNames": [ "Cui" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Gotoh" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Zang" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Gregorski" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Gieser" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Peng" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Fann" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Seifert" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Zhao" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": 
[ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib23", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1371/journal.pgen.1002649" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 22511886 } ], "isPartOf": { "isPartOf": { "name": "PLOS Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 8 }, "title": "Transcriptional regulation of rod photoreceptor homeostasis revealed by in vivo NRL targetome analysis", "type": "Article" }, { "authors": [ { "familyNames": [ "Harris" ], "givenNames": [ "CR" ], "type": "Person" }, { "familyNames": [ "Millman" ], "givenNames": [ "KJ" ], "type": "Person" }, { "familyNames": [ "van", "der", "Walt" ], "givenNames": [ "SJ" ], "type": "Person" }, { "familyNames": [ "Gommers" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Virtanen" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Cournapeau" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Wieser" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Taylor" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Berg" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Smith" ], "givenNames": [ "NJ" ], "type": "Person" }, { "familyNames": [ "Kern" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Picus" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Hoyer" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "van", "Kerkwijk" ], "givenNames": [ "MH" ], "type": "Person" }, { "familyNames": [ "Brett" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Haldane" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Del", "Río" ], "givenNames": [ "JF" ], "type": "Person" }, { "familyNames": [ "Wiebe" ], 
"givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Peterson" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Gérard-Marchant" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Sheppard" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Reddy" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Weckesser" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Abbasi" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Gohlke" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Oliphant" ], "givenNames": [ "TE" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib24", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41586-020-2649-2" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32939066 } ], "isPartOf": { "isPartOf": { "name": "Nature", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 585 }, "pageEnd": 362, "pageStart": 357, "title": "Array programming with NumPy", "type": "Article" }, { "authors": [ { "familyNames": [ "Hennig" ], "givenNames": [ "AK" ], "type": "Person" }, { "familyNames": [ "Peng" ], "givenNames": [ "GH" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2008" }, "id": "bib25", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.brainres.2007.06.036" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 17662965 } ], "isPartOf": { "isPartOf": { "name": "Brain Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 1192 }, "pageEnd": 133, "pageStart": 
114, "title": "Regulation of photoreceptor gene expression by Crx-associated transcription factor network", "type": "Article" }, { "authors": [ { "familyNames": [ "Hlawatsch" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Karlstetter" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Aslanidis" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Lückoff" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Walczak" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Plank" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Böck" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Langmann" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2013" }, "id": "bib26", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1371/journal.pone.0060633" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 23565263 } ], "isPartOf": { "isPartOf": { "name": "PLOS ONE", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 8 }, "title": "Sterile alpha motif containing 7 (SAMD7) is a novel Crx-regulated transcriptional repressor in the retina", "type": "Article" }, { "authors": [ { "familyNames": [ "Hoffman" ], "givenNames": [ "MM" ], "type": "Person" }, { "familyNames": [ "Buske" ], "givenNames": [ "OJ" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Weng" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Bilmes" ], "givenNames": [ "JA" ], "type": "Person" }, { "familyNames": [ "Noble" ], "givenNames": [ "WS" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib27", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nmeth.1937" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 22426492 } ], "isPartOf": { "isPartOf": { "name": "Nature Methods", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 9 }, "pageEnd": 476, "pageStart": 473, "title": "Unsupervised pattern discovery in human chromatin structure through genomic segmentation", "type": "Article" }, { "authors": [ { "familyNames": [ "Hsiau" ], "givenNames": [ "THC" ], "type": "Person" }, { "familyNames": [ "Diaconu" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Cepko" ], "givenNames": [ "CL" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2007" }, "id": "bib28", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1371/journal.pone.0000643" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 17653270 } ], "isPartOf": { "isPartOf": { "name": "PLOS ONE", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 2 }, "title": "The cis-regulatory logic of the mammalian photoreceptor transcriptional network", "type": "Article" }, { "authors": [ { "familyNames": [ "Huang" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Petrykowska" ], "givenNames": [ "HM" ], "type": "Person" }, { "familyNames": [ "Miller" ], "givenNames": [ "BF" ], "type": "Person" }, { "familyNames": [ "Elnitski" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Ovcharenko" ], "givenNames": [ "I" ], "type": "Person" } ], "datePublished": { 
"type": "Date", "value": "2019" }, "id": "bib29", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.247007.118" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30886051 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 29 }, "pageEnd": 667, "pageStart": 657, "title": "Identification of human silencers by correlating cross-tissue epigenetic profiles and gene expression", "type": "Article" }, { "authors": [ { "familyNames": [ "Huang" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Liang" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Goñi" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Damdimopoulos" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Ballaire" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Jager" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Niskanen" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Han" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Jakobsson" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Bracken" ], "givenNames": [ "AP" ], "type": "Person" }, { "familyNames": [ "Aouadi" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Venteclef" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Kaikkonen" ], "givenNames": [ "MU" ], "type": "Person" }, { "familyNames": [ "Fan" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Treuter" ], "givenNames": [ "E" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2021" }, "id": "bib30", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.molcel.2020.12.040" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 33503407 } ], "isPartOf": { "isPartOf": { "name": "Molecular Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 81 }, "pageEnd": 968, "pageStart": 953, "title": "The corepressors GPS2 and SMRT control enhancer and silencer remodeling via eRNA transcription during inflammatory activation of macrophages", "type": "Article" }, { "authors": [ { "familyNames": [ "Hughes" ], "givenNames": [ "AEO" ], "type": "Person" }, { "familyNames": [ "Enright" ], "givenNames": [ "JM" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Shen" ], "givenNames": [ "SQ" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2017" }, "id": "bib31", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/srep43184" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 28256534 } ], "isPartOf": { "isPartOf": { "name": "Scientific Reports", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 7 }, "title": "Cell type-specific epigenomic analysis reveals a uniquely closed chromatin architecture in mouse rod photoreceptors", "type": "Article" }, { "authors": [ { "familyNames": [ "Hughes" ], "givenNames": [ "AEO" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2018" }, "id": "bib32", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.231886.117" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30158147 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 28 }, "pageEnd": 1531, "pageStart": 1520, "title": "A massively parallel reporter assay reveals context-dependent activity of homeodomain binding sites in vivo", "type": "Article" }, { "authors": [ { "familyNames": [ "Hunter" ], "givenNames": [ "JD" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2007" }, "id": "bib33", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1109/MCSE.2007.55" } ], "isPartOf": { "isPartOf": { "name": "Computing in Science & Engineering", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 9 }, "pageEnd": 95, "pageStart": 90, "title": "Matplotlib: A 2D Graphics Environment", "type": "Article" }, { "authors": [ { "familyNames": [ "Irie" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Sanuki" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Muranishi" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Kato" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Chaya" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib34", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1128/MCB.00048-15" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25986607 } ], "isPartOf": { "isPartOf": { "name": "Molecular and Cellular 
Biology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 35 }, "pageEnd": 2596, "pageStart": 2583, "title": "Rax Homeoprotein Regulates Photoreceptor Cell Maturation and Survival in Association with Crx in the Postnatal Mouse Retina", "type": "Article" }, { "authors": [ { "familyNames": [ "Iype" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Taylor" ], "givenNames": [ "DG" ], "type": "Person" }, { "familyNames": [ "Ziesmann" ], "givenNames": [ "SM" ], "type": "Person" }, { "familyNames": [ "Garmey" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Watada" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Mirmira" ], "givenNames": [ "RG" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2004" }, "id": "bib35", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1210/me.2004-0006" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 15056733 } ], "isPartOf": { "isPartOf": { "name": "Molecular Endocrinology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 18 }, "pageEnd": 1375, "pageStart": 1363, "title": "The transcriptional repressor Nkx6.1 also functions as a deoxyribonucleic acid context-dependent transcriptional activator during pancreatic beta-cell differentiation: evidence for feedback activation of the nkx6.1 gene by Nkx6.1", "type": "Article" }, { "authors": [ { "familyNames": [ "Jia" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Oh" ], "givenNames": [ "ECT" ], "type": "Person" }, { "familyNames": [ "Ng" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Srinivas" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Brooks" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": [ "A" ], "type": "Person" }, { 
"familyNames": [ "Forrest" ], "givenNames": [ "D" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2009" }, "id": "bib36", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.0902425106" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 19805139 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 106 }, "pageEnd": 17539, "pageStart": 17534, "title": "Retinoid-related orphan nuclear receptor RORbeta is an early-acting factor in rod photoreceptor development", "type": "Article" }, { "authors": [ { "familyNames": [ "Jiang" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Cai" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Zhou" ], "givenNames": [ "Q" ], "type": "Person" }, { "familyNames": [ "Levine" ], "givenNames": [ "M" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1993" }, "id": "bib37", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1002/j.1460-2075.1993.tb05989.x" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 8344257 } ], "isPartOf": { "isPartOf": { "name": "The EMBO Journal", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 12 }, "pageEnd": 3209, "pageStart": 3201, "title": "Conversion of a dorsal-dependent silencer into an enhancer: evidence for dorsal corepressors", "type": "Article" }, { "authors": [ { "familyNames": [ "Johnson" ], "givenNames": [ "KD" ], "type": "Person" }, { "familyNames": [ "Kim" ], "givenNames": [ "SI" ], "type": "Person" }, { "familyNames": [ "Bresnick" ], "givenNames": [ "EH" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2006" }, 
"id": "bib38", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.0604041103" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 17043224 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 103 }, "pageEnd": 15944, "pageStart": 15939, "title": "Differential sensitivities of transcription factor target genes underlie cell type-specific gene expression profiles", "type": "Article" }, { "authors": [ { "familyNames": [ "Junion" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Spivakov" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Girardot" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Braun" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Gustafson" ], "givenNames": [ "EH" ], "type": "Person" }, { "familyNames": [ "Birney" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Furlong" ], "givenNames": [ "EEM" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib39", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.cell.2012.01.030" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 22304916 } ], "isPartOf": { "isPartOf": { "name": "Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 148 }, "pageEnd": 486, "pageStart": 473, "title": "A transcription factor collective defines cardiac cell fate and reflects lineage history", "type": "Article" }, { "authors": [ { "familyNames": [ "Kinney" ], "givenNames": [ "JB" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2021" }, "id": "bib40", "title": "Logomaker", "type": "Article", 
"url": "https://github.com/jbkinney/logomaker" }, { "authors": [ { "familyNames": [ "Kelley" ], "givenNames": [ "DR" ], "type": "Person" }, { "familyNames": [ "Snoek" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Rinn" ], "givenNames": [ "JL" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2016" }, "id": "bib41", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.200535.115" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 27197224 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 26 }, "pageEnd": 999, "pageStart": 990, "title": "Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks", "type": "Article" }, { "authors": [ { "familyNames": [ "Khan" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Fornes" ], "givenNames": [ "O" ], "type": "Person" }, { "familyNames": [ "Stigliani" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Gheorghe" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Castro-Mondragon" ], "givenNames": [ "JA" ], "type": "Person" }, { "familyNames": [ "van", "der", "Lee" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Bessy" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Chèneby" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Kulkarni" ], "givenNames": [ "SR" ], "type": "Person" }, { "familyNames": [ "Tan" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Baranasic" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Arenillas" ], "givenNames": [ "DJ" ], "type": "Person" }, { "familyNames": [ "Sandelin" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Vandepoele" ], 
"givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Lenhard" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Ballester" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Wasserman" ], "givenNames": [ "WW" ], "type": "Person" }, { "familyNames": [ "Parcy" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Mathelier" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2018" }, "id": "bib42", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/nar/gkx1188" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 29161433 } ], "isPartOf": { "isPartOf": { "name": "Nucleic Acids Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 46 }, "title": "JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework", "type": "Article" }, { "authors": [ { "familyNames": [ "Kimura" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Singh" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Wawrousek" ], "givenNames": [ "EF" ], "type": "Person" }, { "familyNames": [ "Kikuchi" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Nakamura" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Shinohara" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2000" }, "id": "bib43", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1074/jbc.275.2.1152" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 10625658 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Biological Chemistry", "type": "Periodical" }, "type": 
"PublicationVolume", "volumeNumber": 275 }, "pageEnd": 1160, "pageStart": 1152, "title": "Both PCE-1/RX and OTX/CRX interactions are necessary for photoreceptor-specific gene expression", "type": "Article" }, { "authors": [ { "familyNames": [ "Klemm" ], "givenNames": [ "SL" ], "type": "Person" }, { "familyNames": [ "Shipony" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Greenleaf" ], "givenNames": [ "WJ" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2019" }, "id": "bib44", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41576-018-0089-8" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30675018 } ], "isPartOf": { "isPartOf": { "name": "Nature Reviews. Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 20 }, "pageEnd": 220, "pageStart": 207, "title": "Chromatin accessibility and the regulatory epigenome", "type": "Article" }, { "authors": [ { "familyNames": [ "Koike" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Nishida" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Ueno" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Saito" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Sanuki" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Sato" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Aizawa" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Matsuo" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Suzuki" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Kondo" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { 
"type": "Date", "value": "2007" }, "id": "bib45", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1128/MCB.01209-07" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 17908793 } ], "isPartOf": { "isPartOf": { "name": "Molecular and Cellular Biology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 27 }, "pageEnd": 8329, "pageStart": 8318, "title": "Functional roles of Otx2 transcription factor in postnatal mouse retinal development", "type": "Article" }, { "authors": [ { "familyNames": [ "Kulakovskiy" ], "givenNames": [ "IV" ], "type": "Person" }, { "familyNames": [ "Vorontsov" ], "givenNames": [ "IE" ], "type": "Person" }, { "familyNames": [ "Yevshin" ], "givenNames": [ "IS" ], "type": "Person" }, { "familyNames": [ "Sharipov" ], "givenNames": [ "RN" ], "type": "Person" }, { "familyNames": [ "Fedorova" ], "givenNames": [ "AD" ], "type": "Person" }, { "familyNames": [ "Rumynskiy" ], "givenNames": [ "EI" ], "type": "Person" }, { "familyNames": [ "Medvedeva" ], "givenNames": [ "YA" ], "type": "Person" }, { "familyNames": [ "Magana-Mora" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Bajic" ], "givenNames": [ "VB" ], "type": "Person" }, { "familyNames": [ "Papatsenko" ], "givenNames": [ "DA" ], "type": "Person" }, { "familyNames": [ "Kolpakov" ], "givenNames": [ "FA" ], "type": "Person" }, { "familyNames": [ "Makeev" ], "givenNames": [ "VJ" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2018" }, "id": "bib46", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/nar/gkx1106" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 29140464 } ], "isPartOf": { "isPartOf": { "name": "Nucleic Acids Research", 
"type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 46 }, "title": "HOCOMOCO: Towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-seq analysis", "type": "Article" }, { "authors": [ { "familyNames": [ "Kwasnieski" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Mogno" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Cohen" ], "givenNames": [ "BA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib47", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.1210678109" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 23129659 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 109 }, "pageEnd": 19503, "pageStart": 19498, "title": "Complex effects of nucleotide variants in a mammalian cis-regulatory element", "type": "Article" }, { "authors": [ { "familyNames": [ "Kwasnieski" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Fiore" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Chaudhari" ], "givenNames": [ "HG" ], "type": "Person" }, { "familyNames": [ "Cohen" ], "givenNames": [ "BA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2014" }, "id": "bib48", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.173518.114" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25035418 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": 
"Periodical" }, "type": "PublicationVolume", "volumeNumber": 24 }, "pageEnd": 1602, "pageStart": 1595, "title": "High-throughput functional testing of ENCODE segmentation predictions", "type": "Article" }, { "authors": [ { "familyNames": [ "Lee" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Williams" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Abdelaziz" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2010" }, "id": "bib49", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/gt.2010.77" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 20463752 } ], "isPartOf": { "isPartOf": { "name": "Gene Therapy", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 17 }, "pageEnd": 1399, "pageStart": 1390, "title": "Quantitative fine-tuning of photoreceptor cis-regulatory elements through affinity modulation of transcription factor binding sites", "type": "Article" }, { "authors": [ { "familyNames": [ "Lee" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Karchin" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Beer" ], "givenNames": [ "MA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2011" }, "id": "bib50", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.121905.111" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 21875935 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 21 }, 
"pageEnd": 2180, "pageStart": 2167, "title": "Discriminative prediction of mammalian enhancers from DNA sequence", "type": "Article" }, { "authors": [ { "familyNames": [ "Lerner" ], "givenNames": [ "LE" ], "type": "Person" }, { "familyNames": [ "Peng" ], "givenNames": [ "GH" ], "type": "Person" }, { "familyNames": [ "Gribanova" ], "givenNames": [ "YE" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Farber" ], "givenNames": [ "DB" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2005" }, "id": "bib51", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1074/jbc.M500957200" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 15781457 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Biological Chemistry", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 280 }, "pageEnd": 20650, "pageStart": 20642, "title": "Sp4 is expressed in retinal neurons, activates transcription of photoreceptor-specific genes, and synergizes with Crx", "type": "Article" }, { "authors": [ { "familyNames": [ "Liu" ], "givenNames": [ "YR" ], "type": "Person" }, { "familyNames": [ "Laghari" ], "givenNames": [ "ZA" ], "type": "Person" }, { "familyNames": [ "Novoa" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Hughes" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Webster" ], "givenNames": [ "JRM" ], "type": "Person" }, { "familyNames": [ "Goodwin" ], "givenNames": [ "PE" ], "type": "Person" }, { "familyNames": [ "Wheatley" ], "givenNames": [ "SP" ], "type": "Person" }, { "familyNames": [ "Scotting" ], "givenNames": [ "PJ" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2014" }, "id": "bib52", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1186/1471-2202-15-95" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25103589 } ], "isPartOf": { "isPartOf": { "name": "BMC Neuroscience", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 15 }, "title": "Sox2 acts as a transcriptional repressor in neural stem cells", "type": "Article" }, { "authors": [ { "familyNames": [ "Martínez-Montañés" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Rienzo" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Poveda-Huertes" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Pascual-Ahuir" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Proft" ], "givenNames": [ "M" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2013" }, "id": "bib53", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1128/EC.00037-13" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 23435728 } ], "isPartOf": { "isPartOf": { "name": "Eukaryotic Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 12 }, "pageEnd": 647, "pageStart": 636, "title": "Activator and repressor functions of the Mot3 transcription factor in the osmostress response of Saccharomyces cerevisiae", "type": "Article" }, { "authors": [ { "familyNames": [ "McKinney" ], "givenNames": [ "W" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2010" }, "id": "bib54", "pageEnd": 56, "pageStart": 51, "title": "Proceedings of the 9th Python in Science conference", "type": "Article" }, { "authors": [ { "familyNames": [ "Mears" ], "givenNames": [ "AJ" ], "type": "Person" }, { "familyNames": [ "Kondo" ], "givenNames": [ "M" ], "type": "Person" }, { 
"familyNames": [ "Swain" ], "givenNames": [ "PK" ], "type": "Person" }, { "familyNames": [ "Takada" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Bush" ], "givenNames": [ "RA" ], "type": "Person" }, { "familyNames": [ "Saunders" ], "givenNames": [ "TL" ], "type": "Person" }, { "familyNames": [ "Sieving" ], "givenNames": [ "PA" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2001" }, "id": "bib55", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/ng774" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 11694879 } ], "isPartOf": { "isPartOf": { "name": "Nature Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 29 }, "pageEnd": 452, "pageStart": 447, "title": "Nrl is required for rod photoreceptor development", "type": "Article" }, { "authors": [ { "familyNames": [ "Mitton" ], "givenNames": [ "KP" ], "type": "Person" }, { "familyNames": [ "Swain" ], "givenNames": [ "PK" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Xu" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Zack" ], "givenNames": [ "DJ" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2000" }, "id": "bib56", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1074/jbc.M003658200" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 10887186 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Biological Chemistry", "type": "Periodical" }, "type": "PublicationVolume", 
"volumeNumber": 275 }, "pageEnd": 29799, "pageStart": 29794, "title": "The leucine zipper of NRL interacts with the CRX homeodomain. A possible mechanism of transcriptional synergy in rhodopsin regulation", "type": "Article" }, { "authors": [ { "familyNames": [ "Mitton" ], "givenNames": [ "KP" ], "type": "Person" }, { "familyNames": [ "Swain" ], "givenNames": [ "PK" ], "type": "Person" }, { "familyNames": [ "Khanna" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Dowd" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Apel" ], "givenNames": [ "IJ" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2003" }, "id": "bib57", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/hmg/ddg035" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 12566383 } ], "isPartOf": { "isPartOf": { "name": "Human Molecular Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 12 }, "pageEnd": 373, "pageStart": 365, "title": "Interaction of retinal bZIP transcription factor NRL with Flt3-interacting zinc-finger protein Fiz1: possible role of Fiz1 as a transcriptional repressor", "type": "Article" }, { "authors": [ { "familyNames": [ "Moore" ], "givenNames": [ "JE" ], "type": "Person" }, { "familyNames": [ "Purcaro" ], "givenNames": [ "MJ" ], "type": "Person" }, { "familyNames": [ "Pratt" ], "givenNames": [ "HE" ], "type": "Person" }, { "familyNames": [ "Epstein" ], "givenNames": [ "CB" ], "type": "Person" }, { "familyNames": [ "Shoresh" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Adrian" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Kawli" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Davis" ], "givenNames": [ "CA" ], 
"type": "Person" }, { "familyNames": [ "Dobin" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Kaul" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Halow" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Van", "Nostrand" ], "givenNames": [ "EL" ], "type": "Person" }, { "familyNames": [ "Freese" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Gorkin" ], "givenNames": [ "DU" ], "type": "Person" }, { "familyNames": [ "Shen" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "He" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Mackiewicz" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Pauli-Behn" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Williams" ], "givenNames": [ "BA" ], "type": "Person" }, { "familyNames": [ "Mortazavi" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Keller" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "XO" ], "type": "Person" }, { "familyNames": [ "Elhajjajy" ], "givenNames": [ "SI" ], "type": "Person" }, { "familyNames": [ "Huey" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Dickel" ], "givenNames": [ "DE" ], "type": "Person" }, { "familyNames": [ "Snetkova" ], "givenNames": [ "V" ], "type": "Person" }, { "familyNames": [ "Wei" ], "givenNames": [ "X" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "X" ], "type": "Person" }, { "familyNames": [ "Rivera-Mulia" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Rozowsky" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Chhetri" ], "givenNames": [ "SB" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Victorsen" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "White" 
], "givenNames": [ "KP" ], "type": "Person" }, { "familyNames": [ "Visel" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Yeo" ], "givenNames": [ "GW" ], "type": "Person" }, { "familyNames": [ "Burge" ], "givenNames": [ "CB" ], "type": "Person" }, { "familyNames": [ "Lécuyer" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Gilbert" ], "givenNames": [ "DM" ], "type": "Person" }, { "familyNames": [ "Dekker" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Rinn" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Mendenhall" ], "givenNames": [ "EM" ], "type": "Person" }, { "familyNames": [ "Ecker" ], "givenNames": [ "JR" ], "type": "Person" }, { "familyNames": [ "Kellis" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Klein" ], "givenNames": [ "RJ" ], "type": "Person" }, { "familyNames": [ "Noble" ], "givenNames": [ "WS" ], "type": "Person" }, { "familyNames": [ "Kundaje" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Guigó" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Farnham" ], "givenNames": [ "PJ" ], "type": "Person" }, { "familyNames": [ "Cherry" ], "givenNames": [ "JM" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "RM" ], "type": "Person" }, { "familyNames": [ "Ren" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Graveley" ], "givenNames": [ "BR" ], "type": "Person" }, { "familyNames": [ "Gerstein" ], "givenNames": [ "MB" ], "type": "Person" }, { "familyNames": [ "Pennacchio" ], "givenNames": [ "LA" ], "type": "Person" }, { "familyNames": [ "Snyder" ], "givenNames": [ "MP" ], "type": "Person" }, { "familyNames": [ "Bernstein" ], "givenNames": [ "BE" ], "type": "Person" }, { "familyNames": [ "Wold" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Hardison" ], "givenNames": [ "RC" ], "type": "Person" }, { "familyNames": [ "Gingeras" ], "givenNames": [ "TR" ], "type": "Person" }, { 
"familyNames": [ "Stamatoyannopoulos" ], "givenNames": [ "JA" ], "type": "Person" }, { "familyNames": [ "Weng" ], "givenNames": [ "Z" ], "type": "Person" }, { "name": "ENCODE Project Consortium", "type": "Organization" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib58", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41586-020-2493-4" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32728249 } ], "isPartOf": { "isPartOf": { "name": "Nature", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 583 }, "pageEnd": 710, "pageStart": 699, "title": "Expanded encyclopaedias of DNA elements in the human and mouse genomes", "type": "Article" }, { "authors": [ { "familyNames": [ "Morrow" ], "givenNames": [ "EM" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "JE" ], "type": "Person" }, { "familyNames": [ "Cepko" ], "givenNames": [ "CL" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "1999" }, "id": "bib59", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1242/dev.126.1.23" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 9834183 } ], "isPartOf": { "isPartOf": { "name": "Development", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 126 }, "pageEnd": 36, "pageStart": 23, "title": "NeuroD regulates multiple functions in the developing neural retina in rodent", "type": "Article" }, { "authors": [ { "familyNames": [ "Murphy" ], "givenNames": [ "DP" ], "type": "Person" }, { "familyNames": [ "Hughes" ], "givenNames": [ "AE" ], "type": "Person" }, { "familyNames": [ "Lawrence" ], "givenNames": [ 
"KA" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2019" }, "id": "bib60", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.7554/eLife.48216" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 31633482 } ], "isPartOf": { "isPartOf": { "name": "eLife", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 8 }, "title": "Cis-regulatory basis of sister cell type divergence in the vertebrate retina", "type": "Article" }, { "authors": [ { "familyNames": [ "Ngan" ], "givenNames": [ "CY" ], "type": "Person" }, { "familyNames": [ "Wong" ], "givenNames": [ "CH" ], "type": "Person" }, { "familyNames": [ "Tjong" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Goldfeder" ], "givenNames": [ "RL" ], "type": "Person" }, { "familyNames": [ "Choi" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "He" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Gong" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Lin" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Urban" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Chow" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Li" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Lim" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Philip" ], "givenNames": [ "V" ], "type": "Person" }, { "familyNames": [ "Murray" ], "givenNames": [ "SA" ], "type": "Person" }, { "familyNames": [ "Wang" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Wei" ], "givenNames": [ "CL" ], "type": 
"Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib61", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41588-020-0581-x" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32094912 } ], "isPartOf": { "isPartOf": { "name": "Nature Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 52 }, "pageEnd": 272, "pageStart": 264, "title": "Chromatin interaction analyses elucidate the roles of PRC2-bound silencers in mouse development", "type": "Article" }, { "authors": [ { "familyNames": [ "Pang" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Snyder" ], "givenNames": [ "MP" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib62", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41588-020-0578-5" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32094911 } ], "isPartOf": { "isPartOf": { "name": "Nature Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 52 }, "pageEnd": 263, "pageStart": 254, "title": "Systematic identification of silencers in human cells", "type": "Article" }, { "authors": [ { "familyNames": [ "Parker" ], "givenNames": [ "DS" ], "type": "Person" }, { "familyNames": [ "White" ], "givenNames": [ "MA" ], "type": "Person" }, { "familyNames": [ "Ramos" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Cohen" ], "givenNames": [ "BA" ], "type": "Person" }, { "familyNames": [ "Barolo" ], "givenNames": [ "S" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2011" }, "id": "bib63", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", 
"type": "PropertyValue", "value": "10.1126/scisignal.2002077" } ], "isPartOf": { "isPartOf": { "name": "Science Signaling", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 4 }, "title": "The cis-regulatory logic of Hedgehog gradient responses: key roles for gli binding affinity, competition, and cooperativity", "type": "Article" }, { "authors": [ { "familyNames": [ "Pedregosa" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Varoquaux" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Gramfort" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Michel" ], "givenNames": [ "V" ], "type": "Person" }, { "familyNames": [ "Thirion" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Grisel" ], "givenNames": [ "O" ], "type": "Person" }, { "familyNames": [ "Blondel" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Prettenhofer" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Weiss" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Dubourg" ], "givenNames": [ "V" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2011" }, "id": "bib64", "isPartOf": { "isPartOf": { "name": "The Journal of Machine Learning Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 12 }, "pageEnd": 2830, "pageStart": 2825, "title": "Scikit-learn: Machine learning in Python", "type": "Article" }, { "authors": [ { "familyNames": [ "Peng" ], "givenNames": [ "GH" ], "type": "Person" }, { "familyNames": [ "Ahmad" ], "givenNames": [ "O" ], "type": "Person" }, { "familyNames": [ "Ahmad" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Liu" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2005" }, "id": "bib65", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/hmg/ddi070" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 15689355 } ], "isPartOf": { "isPartOf": { "name": "Human Molecular Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 14 }, "pageEnd": 764, "pageStart": 747, "title": "The photoreceptor-specific nuclear receptor Nr2e3 interacts with CRX and exerts opposing effects on the transcription of rod versus cone genes", "type": "Article" }, { "authors": [ { "familyNames": [ "Petrykowska" ], "givenNames": [ "HM" ], "type": "Person" }, { "familyNames": [ "Vockley" ], "givenNames": [ "CM" ], "type": "Person" }, { "familyNames": [ "Elnitski" ], "givenNames": [ "L" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2008" }, "id": "bib66", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1101/gr.073817.107" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 18436892 } ], "isPartOf": { "isPartOf": { "name": "Genome Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 18 }, "pageEnd": 1246, "pageStart": 1238, "title": "Detection and characterization of silencers and enhancer-blockers in the greater CFTR locus", "type": "Article" }, { "authors": [ { "familyNames": [ "Phillips" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Kondev" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Theriot" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Garcia" ], "givenNames": [ "HG" ], "type": "Person" }, { "familyNames": [ "Orme" ], "givenNames": [ "N" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib67", "identifiers": [ { "name": "doi", "propertyID": 
"https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1201/9781134111589" } ], "publisher": { "name": "Garland Science", "type": "Organization" }, "title": "Physical Biology of the Cell", "type": "Article" }, { "authors": [ { "familyNames": [ "Quinlan" ], "givenNames": [ "AR" ], "type": "Person" }, { "familyNames": [ "Hall" ], "givenNames": [ "IM" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2010" }, "id": "bib68", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/bioinformatics/btq033" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 20110278 } ], "isPartOf": { "isPartOf": { "name": "Bioinformatics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 26 }, "pageEnd": 842, "pageStart": 841, "title": "Bedtools: A flexible suite of utilities for comparing genomic features", "type": "Article" }, { "authors": [ { "familyNames": [ "Rachmin" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Amsalem" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Golomb" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Beeri" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Gilon" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Fang" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Nechushtan" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Kay" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Guo" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Yiqing" ], "givenNames": [ "PL" ], "type": "Person" }, { "familyNames": [ "Foo" ], "givenNames": [ "RSY" ], "type": "Person" }, { "familyNames": [ "Fisher" ], "givenNames": [ "DE" ], "type": "Person" }, { "familyNames": [ "Razin" ], "givenNames": [ "E" ], 
"type": "Person" }, { "familyNames": [ "Tshori" ], "givenNames": [ "S" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib69", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.ijcard.2015.05.108" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 26025865 } ], "isPartOf": { "isPartOf": { "name": "International Journal of Cardiology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 195 }, "pageEnd": 94, "pageStart": 85, "title": "FHL2 switches MITF from activator to repressor of Erbin expression during cardiac hypertrophy", "type": "Article" }, { "authors": [ { "familyNames": [ "Rister" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Razzaq" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Boodram" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Desai" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Tsanis" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Jukam" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Desplan" ], "givenNames": [ "C" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib70", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1126/science.aab3417" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 26785491 } ], "isPartOf": { "isPartOf": { "name": "Science", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 350 }, "pageEnd": 1261, "pageStart": 1258, "title": "Single-base pair differences in a shared motif determine differential Rhodopsin expression", 
"type": "Article" }, { "authors": [ { "familyNames": [ "Roger" ], "givenNames": [ "JE" ], "type": "Person" }, { "familyNames": [ "Hiriyanna" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Gotoh" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Hao" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Cheng" ], "givenNames": [ "DF" ], "type": "Person" }, { "familyNames": [ "Ratnapriya" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Kautzmann" ], "givenNames": [ "MAI" ], "type": "Person" }, { "familyNames": [ "Chang" ], "givenNames": [ "B" ], "type": "Person" }, { "familyNames": [ "Swaroop" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2014" }, "id": "bib71", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1172/JCI72722" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 24382353 } ], "isPartOf": { "isPartOf": { "name": "The Journal of Clinical Investigation", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 124 }, "pageEnd": 643, "pageStart": 631, "title": "OTX2 loss causes rod differentiation defect in CRX-associated congenital blindness", "type": "Article" }, { "authors": [ { "familyNames": [ "Ruzycki" ], "givenNames": [ "PA" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "X" ], "type": "Person" }, { "familyNames": [ "Chen" ], "givenNames": [ "S" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2018" }, "id": "bib72", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1186/s13072-018-0212-2" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30068366 } ], "isPartOf": { "isPartOf": { 
"name": "Epigenetics & Chromatin", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 11 }, "title": "CRX directs photoreceptor differentiation by accelerating chromatin remodeling at specific target sites", "type": "Article" }, { "authors": [ { "familyNames": [ "Samee" ], "givenNames": [ "MAH" ], "type": "Person" }, { "familyNames": [ "Bruneau" ], "givenNames": [ "BG" ], "type": "Person" }, { "familyNames": [ "Pollard" ], "givenNames": [ "KS" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2019" }, "id": "bib73", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.cels.2018.12.001" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 30660610 } ], "isPartOf": { "isPartOf": { "name": "Cell Systems", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 8 }, "pageEnd": 42, "pageStart": 27, "title": "A De Novo Shape Motif Discovery Algorithm Reveals Preferences of Transcription Factors for DNA Shape Beyond Sequence Motifs", "type": "Article" }, { "authors": [ { "familyNames": [ "Samee" ], "givenNames": [ "MAH" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2021" }, "id": "bib74", "title": "Shape-motif", "type": "Article", "url": "https://github.com/h-samee/shape-motif" }, { "authors": [ { "familyNames": [ "Sanuki" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Omori" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Koike" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Sato" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Furukawa" ], "givenNames": [ "T" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2010" }, "id": "bib75", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", 
"value": "10.1016/j.febslet.2009.12.030" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 20026326 } ], "isPartOf": { "isPartOf": { "name": "FEBS Letters", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 584 }, "pageEnd": 758, "pageStart": 753, "title": "Panky, a novel photoreceptor-specific ankyrin repeat protein, is a transcriptional cofactor that suppresses CRX-regulated photoreceptor genes", "type": "Article" }, { "authors": [ { "familyNames": [ "Segert" ], "givenNames": [ "JA" ], "type": "Person" }, { "familyNames": [ "Gisselbrecht" ], "givenNames": [ "SS" ], "type": "Person" }, { "familyNames": [ "Bulyk" ], "givenNames": [ "ML" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2021" }, "id": "bib76", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.tig.2021.02.002" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 33712326 } ], "isPartOf": { "isPartOf": { "name": "Trends in Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 37 }, "pageEnd": 527, "pageStart": 514, "title": "Transcriptional silencers: Driving gene expression with the brakes on", "type": "Article" }, { "authors": [ { "familyNames": [ "Sethi" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Gu" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Gumusgoz" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Chan" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Yan" ], "givenNames": [ "K-K" ], "type": "Person" }, { "familyNames": [ "Rozowsky" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Barozzi" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Afzal" ], "givenNames": [ "V" ], "type": "Person" }, { 
"familyNames": [ "Akiyama" ], "givenNames": [ "JA" ], "type": "Person" }, { "familyNames": [ "Plajzer-Frick" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Yan" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Novak" ], "givenNames": [ "CS" ], "type": "Person" }, { "familyNames": [ "Kato" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Garvin" ], "givenNames": [ "TH" ], "type": "Person" }, { "familyNames": [ "Pham" ], "givenNames": [ "Q" ], "type": "Person" }, { "familyNames": [ "Harrington" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Mannion" ], "givenNames": [ "BJ" ], "type": "Person" }, { "familyNames": [ "Lee" ], "givenNames": [ "EA" ], "type": "Person" }, { "familyNames": [ "Fukuda-Yuzawa" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Visel" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Dickel" ], "givenNames": [ "DE" ], "type": "Person" }, { "familyNames": [ "Yip" ], "givenNames": [ "KY" ], "type": "Person" }, { "familyNames": [ "Sutton" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Pennacchio" ], "givenNames": [ "LA" ], "type": "Person" }, { "familyNames": [ "Gerstein" ], "givenNames": [ "M" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib77", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41592-020-0907-8" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32737473 } ], "isPartOf": { "isPartOf": { "name": "Nature Methods", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 17 }, "pageEnd": 814, "pageStart": 807, "title": "Supervised enhancer prediction with epigenetic pattern recognition and targeted validation", "type": "Article" }, { "authors": [ { "familyNames": [ "Spitz" ], "givenNames": [ "F" ], 
"type": "Person" }, { "familyNames": [ "Furlong" ], "givenNames": [ "EEM" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2012" }, "id": "bib78", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nrg3207" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 22868264 } ], "isPartOf": { "isPartOf": { "name": "Nature Reviews. Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 13 }, "pageEnd": 626, "pageStart": 613, "title": "Transcription factors: From enhancer binding to developmental control", "type": "Article" }, { "authors": [ { "familyNames": [ "Srinivas" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Ng" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Liu" ], "givenNames": [ "H" ], "type": "Person" }, { "familyNames": [ "Jia" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Forrest" ], "givenNames": [ "D" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2006" }, "id": "bib79", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1210/me.2005-0505" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 16574740 } ], "isPartOf": { "isPartOf": { "name": "Molecular Endocrinology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 20 }, "pageEnd": 1741, "pageStart": 1728, "title": "Activation of the blue opsin gene in cone photoreceptor development by retinoid-related orphan receptor beta", "type": "Article" }, { "authors": [ { "familyNames": [ "Stampfel" ], "givenNames": [ "G" ], "type": "Person" }, { "familyNames": [ "Kazmar" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Frank" ], "givenNames": [ "O" ], 
"type": "Person" }, { "familyNames": [ "Wienerroither" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Reiter" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "Stark" ], "givenNames": [ "A" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib80", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nature15545" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 26550828 } ], "isPartOf": { "isPartOf": { "name": "Nature", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 528 }, "pageEnd": 151, "pageStart": 147, "title": "Transcriptional regulators form diverse groups with context-dependent regulatory functions", "type": "Article" }, { "authors": [ { "familyNames": [ "Tareen" ], "givenNames": [ "A" ], "type": "Person" }, { "familyNames": [ "Kinney" ], "givenNames": [ "JB" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib81", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1093/bioinformatics/btz921" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 31821414 } ], "isPartOf": { "isPartOf": { "name": "Bioinformatics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 36 }, "pageEnd": 2274, "pageStart": 2272, "title": "Logomaker: beautiful sequence logos in Python", "type": "Article" }, { "authors": [ { "familyNames": [ "Virtanen" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Gommers" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Oliphant" ], "givenNames": [ "TE" ], "type": "Person" }, { "familyNames": [ "Haberland" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ 
"Reddy" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Cournapeau" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Burovski" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Peterson" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Weckesser" ], "givenNames": [ "W" ], "type": "Person" }, { "familyNames": [ "Bright" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "van", "der", "Walt" ], "givenNames": [ "SJ" ], "type": "Person" }, { "familyNames": [ "Brett" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Wilson" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Millman" ], "givenNames": [ "KJ" ], "type": "Person" }, { "familyNames": [ "Mayorov" ], "givenNames": [ "N" ], "type": "Person" }, { "familyNames": [ "Nelson" ], "givenNames": [ "ARJ" ], "type": "Person" }, { "familyNames": [ "Jones" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Kern" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Larson" ], "givenNames": [ "E" ], "type": "Person" }, { "familyNames": [ "Carey" ], "givenNames": [ "CJ" ], "type": "Person" }, { "familyNames": [ "Polat" ], "givenNames": [ "İ" ], "type": "Person" }, { "familyNames": [ "Feng" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Moore" ], "givenNames": [ "EW" ], "type": "Person" }, { "familyNames": [ "VanderPlas" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Laxalde" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Perktold" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Cimrman" ], "givenNames": [ "R" ], "type": "Person" }, { "familyNames": [ "Henriksen" ], "givenNames": [ "I" ], "type": "Person" }, { "familyNames": [ "Quintero" ], "givenNames": [ "EA" ], "type": "Person" }, { "familyNames": [ "Harris" ], "givenNames": [ "CR" ], "type": "Person" }, { "familyNames": [ "Archibald" ], "givenNames": [ "AM" ], 
"type": "Person" }, { "familyNames": [ "Ribeiro" ], "givenNames": [ "AH" ], "type": "Person" }, { "familyNames": [ "Pedregosa" ], "givenNames": [ "F" ], "type": "Person" }, { "familyNames": [ "van", "Mulbregt" ], "givenNames": [ "P" ], "type": "Person" }, { "name": "SciPy 1.0 Contributors", "type": "Organization" } ], "datePublished": { "type": "Date", "value": "2020" }, "id": "bib82", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/s41592-019-0686-2" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 32015543 } ], "isPartOf": { "isPartOf": { "name": "Nature Methods", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 17 }, "pageEnd": 272, "pageStart": 261, "title": "SciPy 1.0: Fundamental algorithms for scientific computing in Python", "type": "Article" }, { "authors": [ { "familyNames": [ "Wang" ], "givenNames": [ "S" ], "type": "Person" }, { "familyNames": [ "Sengel" ], "givenNames": [ "C" ], "type": "Person" }, { "familyNames": [ "Emerson" ], "givenNames": [ "MM" ], "type": "Person" }, { "familyNames": [ "Cepko" ], "givenNames": [ "CL" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2014" }, "id": "bib83", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.devcel.2014.07.018" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 25155555 } ], "isPartOf": { "isPartOf": { "name": "Developmental Cell", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 30 }, "pageEnd": 527, "pageStart": 513, "title": "A gene regulatory network controls the binary fate decision of rod and bipolar cells in the vertebrate retina", "type": "Article" }, { "authors": [ { "familyNames": [ "Webber" ], "givenNames": [ "AL" 
], "type": "Person" }, { "familyNames": [ "Hodor" ], "givenNames": [ "P" ], "type": "Person" }, { "familyNames": [ "Thut" ], "givenNames": [ "CJ" ], "type": "Person" }, { "familyNames": [ "Vogt" ], "givenNames": [ "TF" ], "type": "Person" }, { "familyNames": [ "Zhang" ], "givenNames": [ "T" ], "type": "Person" }, { "familyNames": [ "Holder" ], "givenNames": [ "DJ" ], "type": "Person" }, { "familyNames": [ "Petrukhin" ], "givenNames": [ "K" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2008" }, "id": "bib84", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.exer.2008.04.006" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 18547563 } ], "isPartOf": { "isPartOf": { "name": "Experimental Eye Research", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 87 }, "pageEnd": 48, "pageStart": 35, "title": "Dual role of Nr2e3 in photoreceptor development and maintenance", "type": "Article" }, { "authors": [ { "familyNames": [ "White" ], "givenNames": [ "MA" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Cohen" ], "givenNames": [ "BA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2013" }, "id": "bib85", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1073/pnas.1307449110" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 23818646 } ], "isPartOf": { "isPartOf": { "name": "PNAS", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 110 }, "pageEnd": 11957, "pageStart": 11952, "title": "Massively parallel in vivo enhancer assay reveals that 
highly local features determine the cis-regulatory function of ChIP-seq peaks", "type": "Article" }, { "authors": [ { "familyNames": [ "White" ], "givenNames": [ "MA" ], "type": "Person" }, { "familyNames": [ "Kwasnieski" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Myers" ], "givenNames": [ "CA" ], "type": "Person" }, { "familyNames": [ "Shen" ], "givenNames": [ "SQ" ], "type": "Person" }, { "familyNames": [ "Corbo" ], "givenNames": [ "JC" ], "type": "Person" }, { "familyNames": [ "Cohen" ], "givenNames": [ "BA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2016" }, "id": "bib86", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.celrep.2016.09.066" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 27783940 } ], "isPartOf": { "isPartOf": { "name": "Cell Reports", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 17 }, "pageEnd": 1254, "pageStart": 1247, "title": "A Simple Grammar Defines Activating and Repressing cis-Regulatory Elements in Photoreceptors", "type": "Article" }, { "authors": [ { "familyNames": [ "Wunderlich" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Mirny" ], "givenNames": [ "LA" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2009" }, "id": "bib87", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.tig.2009.08.003" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 19815308 } ], "isPartOf": { "isPartOf": { "name": "Trends in Genetics", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 25 }, "pageEnd": 440, "pageStart": 434, "title": "Different gene regulation strategies revealed by analysis of binding 
motifs", "type": "Article" }, { "authors": [ { "familyNames": [ "Yang" ], "givenNames": [ "Z" ], "type": "Person" }, { "familyNames": [ "Ding" ], "givenNames": [ "K" ], "type": "Person" }, { "familyNames": [ "Pan" ], "givenNames": [ "L" ], "type": "Person" }, { "familyNames": [ "Deng" ], "givenNames": [ "M" ], "type": "Person" }, { "familyNames": [ "Gan" ], "givenNames": [ "L" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2003" }, "id": "bib88", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1016/j.ydbio.2003.08.005" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 14623245 } ], "isPartOf": { "isPartOf": { "name": "Developmental Biology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 264 }, "pageEnd": 254, "pageStart": 240, "title": "Math5 determines the competence state of retinal ganglion cell progenitors", "type": "Article" }, { "authors": [ { "familyNames": [ "Zhao" ], "givenNames": [ "Y" ], "type": "Person" }, { "familyNames": [ "Granas" ], "givenNames": [ "D" ], "type": "Person" }, { "familyNames": [ "Stormo" ], "givenNames": [ "GD" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2009" }, "id": "bib89", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1371/journal.pcbi.1000590" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 19997485 } ], "isPartOf": { "isPartOf": { "name": "PLOS Computational Biology", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 5 }, "title": "Inferring binding energies from selected binding sites", "type": "Article" }, { "authors": [ { "familyNames": [ "Zhou" ], "givenNames": [ "J" ], "type": "Person" }, { "familyNames": [ "Troyanskaya" ], 
"givenNames": [ "OG" ], "type": "Person" } ], "datePublished": { "type": "Date", "value": "2015" }, "id": "bib90", "identifiers": [ { "name": "doi", "propertyID": "https://registry.identifiers.org/registry/doi", "type": "PropertyValue", "value": "10.1038/nmeth.3547" }, { "name": "pmid", "propertyID": "https://registry.identifiers.org/registry/pmid", "type": "PropertyValue", "value": 26301843 } ], "isPartOf": { "isPartOf": { "name": "Nature Methods", "type": "Periodical" }, "type": "PublicationVolume", "volumeNumber": 12 }, "pageEnd": 934, "pageStart": 931, "title": "Predicting effects of noncoding variants with deep learning-based sequence model", "type": "Article" } ], "title": "Information content differentiates enhancers from silencers in mouse photoreceptors" }, "nbformat": 4, "nbformat_minor": 4 }