
Development of an ecological biopesticide using RNAi
EMBRAPA Agroenergy
2020 - 2022
R&D project with industry to develop an ecological biopesticide against weeds using RNA interference (RNAi).
My role was to identify lethal genes to target based on specific criteria and analysis of weed genetic data.
As a result, I delivered a pipeline for finding target genes for RNAi and compiled a database of potential genes.
Overview and Objectives
Note: This project was carried out in partnership with a private company. Due to a confidentiality agreement, details of the developed code and results cannot be shared.
However, I present here the information I am permitted to disclose under that agreement.
*Off-target: any other species that must not be affected by the product, such as crop cultivars, humans, and economically important species.
About a quarter of agricultural production costs are allocated to weed, pest, and disease control. The excessive use of agrochemicals has increased these costs and generated problems such as resistance, health hazards, and negative environmental impacts. More efficient and sustainable technologies are therefore needed, and RNA interference (RNAi) technology has stood out globally for its high specificity and efficiency in weed, pest, and disease control.
This project aimed to develop an ecological biopesticide using sprayed RNAi technology for weed control in crops such as corn and soybeans.
As a biological data analyst (bioinformatician), I participated in the first stage of the project, which aimed to identify potential weed genes to be targeted for RNAi silencing. A good candidate should meet several criteria covering similarity to off-targets*, gene function, and molecular stability. To evaluate these factors, data were collected, integrated, and analyzed using pipelines developed by the analysis group. Finally, a database containing the selected genes, ranked according to these criteria, was delivered.
Analysis Workflow
1
Target weed and off-target data mining
- First, a survey of publicly available omics datasets for the target and off-target* species was conducted.
  - Databases: GenBank (NCBI), EMBL
- The datasets found were assessed for quality and integrity. Those that did not meet the minimum requirements were discarded.
  - Tools: FastQC, HISAT2...
- Additionally, the scientific literature was searched for available RNAi analysis technologies and pipelines.
2
Preprocessing and integration of raw data
- After collection and curation, the data were processed for analysis.
- Some bioinformatics tools used for processing raw data were:
  - RNA-seq assembly: Trinity
  - Annotation: Trinotate
- Additionally, the following programming languages and tools were used for data organization and visualization:
  - Python
  - Bash/Shell
  - Excel
3
Search for target gene data
- We searched the literature and public databases to identify plant genes that are lethal when silenced.
  - Due to the scarcity of information about our target species, we used data from model plants.
  - Databases: UniProt, BdTAIR, NCBI GenBank
- The sequences of the selected lethal genes were searched for within the annotated transcriptomes of our target species using alignment tools such as BLAST, together with Python scripts.
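As an illustration of the kind of Python script that can sit behind such a BLAST search, here is a minimal sketch that reads BLAST tabular output (`-outfmt 6`, whose 12 default columns are standard) and keeps the best hit per query. It is not the project's code: the sequence names, the example rows, and the `max_evalue` cutoff are assumptions chosen only for the example.

```python
import csv
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

# Hypothetical example rows: model-plant lethal genes (queries) aligned
# against transcripts of the target weed (subjects).
EXAMPLE_BLAST = (
    "lethal_gene_A\ttranscript_001\t88.0\t300\t36\t0\t1\t300\t5\t304\t1e-90\t310\n"
    "lethal_gene_A\ttranscript_002\t92.5\t400\t30\t0\t1\t400\t10\t409\t1e-150\t520\n"
    "lethal_gene_B\ttranscript_007\t81.2\t250\t47\t0\t1\t250\t1\t250\t1e-40\t160\n"
    "lethal_gene_C\ttranscript_009\t70.0\t40\t12\t0\t1\t40\t1\t40\t0.5\t25\n"
)


def best_hits(blast_tsv, max_evalue=1e-10):
    """Return the highest-bitscore hit per query, ignoring weak hits.

    blast_tsv is any iterable of tab-separated lines (an open file works).
    max_evalue is an assumed cutoff, not a value taken from the project.
    """
    best = {}
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) > max_evalue:
            continue  # hit too weak to count as a homolog
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


if __name__ == "__main__":
    for query, hit in best_hits(io.StringIO(EXAMPLE_BLAST)).items():
        print(query, "->", hit["sseqid"], hit["evalue"])
```

Keeping only one hit per query is a simplification; a real search might keep every transcript above the cutoff so that gene families are not missed.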
4
Selection of target genes using established criteria
Elimination based on off-targets
- The sequences of the target plant's lethal genes were compared with the transcriptomes of off-target organisms through alignments and mappings.
  - Tools: BLAST and HISAT2
- When a target plant gene showed a similarity of n nucleotides with any off-target, it was eliminated.
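The elimination rule can be sketched in a few lines of Python. The project itself used BLAST and HISAT2 and its exact similarity definition is not disclosed, so this sketch makes two loud assumptions: "similarity of n nucleotides" is read as an exact contiguous match of length n, and both strands of the off-target are checked. The sequences and the value of `n` are hypothetical.

```python
# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(sequence):
    """Upper-case the sequence and write RNA (U) as DNA (T)."""
    return sequence.upper().replace("U", "T")


def reverse_complement(sequence):
    return normalize(sequence).translate(_COMPLEMENT)[::-1]


def kmers(sequence, n):
    """Set of all contiguous substrings of length n."""
    seq = normalize(sequence)
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def build_off_target_index(off_targets, n):
    """Collect every n-mer from both strands of all off-target sequences."""
    index = set()
    for sequence in off_targets.values():
        index |= kmers(sequence, n)
        index |= kmers(reverse_complement(sequence), n)
    return index


def filter_targets(targets, off_targets, n):
    """Keep only target genes that share no n-mer with any off-target.

    Assumption: an exact contiguous match of length n is what disqualifies
    a gene; the project's real criterion may differ.
    """
    index = build_off_target_index(off_targets, n)
    return {
        name: sequence
        for name, sequence in targets.items()
        if kmers(sequence, n).isdisjoint(index)
    }


if __name__ == "__main__":
    # Hypothetical toy sequences; real transcripts are far longer.
    targets = {"geneA": "ATGCGTACGTTAGC", "geneB": "TTTTTTTTGGGGGGGG"}
    off_targets = {"crop1": "CCCCGTACGTTACCCC"}
    print(sorted(filter_targets(targets, off_targets, n=8)))
```

An exact n-mer index is fast and easy to reason about, but unlike an alignment it does not tolerate mismatches, which is one reason an aligner would be preferred in practice.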
Classification based on additional criteria
- Other metrics were used to establish a priority ranking for the target gene sequences.
- These metrics were selected based on the literature, which highlights the importance of certain characteristics for successful RNAi.*
*Details are protected by the confidentiality agreement.
- To measure them, we used public tools and developed our own Python scripts. Some of the public tools used:
  - iScore Designer: calculation of thermodynamic stability
  - Integrated DNA Technologies siRNA design tool: presence of a cleavage site for siRNA formation by Dicer
  - Bowtie and Salmon: quantification of gene transcript expression
  - RNAfold: verification of secondary structure
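As one example of turning tool output into a per-transcript metric, the sketch below reads a Salmon `quant.sf` file, which is tab-separated with the columns `Name`, `Length`, `EffectiveLength`, `TPM`, and `NumReads`, and returns the expression level of each transcript. The transcript names, the values, and the `min_tpm` threshold are hypothetical; the project's actual metrics and cutoffs are confidential and are not reproduced here.

```python
import csv
import io

# Hypothetical content of a Salmon quant.sf file (tab-separated).
EXAMPLE_QUANT = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "transcript_001\t1200\t1050.0\t12.5\t340\n"
    "transcript_002\t800\t650.0\t0.3\t4\n"
    "transcript_003\t2100\t1950.0\t40.0\t1800\n"
)


def load_tpm(quant_file, min_tpm=0.0):
    """Map transcript name to TPM, keeping transcripts with TPM >= min_tpm.

    quant_file is any iterable of lines, such as an open quant.sf file.
    min_tpm is an assumed filter used only for this illustration.
    """
    reader = csv.DictReader(quant_file, delimiter="\t")
    expression = {}
    for row in reader:
        tpm = float(row["TPM"])
        if tpm >= min_tpm:
            expression[row["Name"]] = tpm
    return expression


if __name__ == "__main__":
    for name, tpm in load_tpm(io.StringIO(EXAMPLE_QUANT), min_tpm=1.0).items():
        print(name, tpm)
```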
5
Database Development
- We analyzed the transcriptomes of the target species, measured the parameters of all transcripts, and stored them in a database shared with the other researchers on the project. This allowed them to view and select genes according to their own prioritization criteria for in vitro analyses.
- The database was made available privately to the project's researchers on the EMBRAPA website, in line with the confidentiality terms of the contract.
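A database of this kind can be pictured with a small SQLite sketch. The real schema, storage technology, and criteria are not public, so everything here is an assumption: the table name, the columns `lethal_homolog`, `tpm`, and `stability_score`, and the ordering used for prioritization are placeholders that only show how per-transcript metrics can be stored once and then queried with different priorities.

```python
import sqlite3


def build_database(rows):
    """Create an in-memory database with one row per transcript.

    The schema is hypothetical; it stands in for the confidential metrics.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transcripts ("
        " transcript_id TEXT PRIMARY KEY,"
        " lethal_homolog TEXT,"
        " tpm REAL,"
        " stability_score REAL)"
    )
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def top_candidates(conn, min_tpm, limit):
    """Return transcript IDs above an expression floor, best score first."""
    cursor = conn.execute(
        "SELECT transcript_id FROM transcripts"
        " WHERE tpm >= ?"
        " ORDER BY stability_score DESC, tpm DESC"
        " LIMIT ?",
        (min_tpm, limit),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    # Hypothetical rows: (transcript, model-plant homolog, TPM, score).
    example_rows = [
        ("transcript_001", "lethal_gene_A", 12.5, 0.8),
        ("transcript_002", "lethal_gene_B", 0.3, 0.9),
        ("transcript_003", "lethal_gene_C", 40.0, 0.6),
    ]
    connection = build_database(example_rows)
    print(top_candidates(connection, min_tpm=1.0, limit=2))
```

Storing every metric and leaving the ranking to the query matches the workflow described above, where each researcher selects genes according to their own prioritization criteria.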
RESULTS
01
Lethal Target Genes List
A list of 22 target genes considered lethal for the plant was generated, along with their measured metrics and sequence designs for bench testing.
02
Database Development
A database containing over 16,000 transcript options, each with all measured criteria, was delivered for future reference by researchers in the project.
03
Analysis Pipeline
We provided the client with the developed pipeline, including all its code and tools, so that researchers can later apply it to other species that were not analyzed in this project.
Final Considerations
This project was conducted in collaboration with a private agricultural technology company, representing a significant opportunity to expand my knowledge in the field of innovation and the development of new technologies.
The project is still ongoing, so access to the full product development process and its applied results is unfortunately not yet available.
Furthermore, working remotely was a significant experience. It allowed me to develop valuable skills such as autonomous time management, virtual communication, and independent problem solving, which differ from those exercised in a traditional in-person work environment.