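Patent title: Genes involved in epigenetic gene silencing
Technical field:
The present invention relates to DNA encoding proteins that regulate gene silencing, in particular gene silencing in plants.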
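Loss of previously active gene expression in plants, also termed gene silencing, can be observed in response to developmental, environmental, or unknown signals. Although it occurs more frequently than mutation, it is remarkably stable during somatic transmission. Gene silencing was initially regarded as a harmful cause of unstable transgene expression; it is now viewed as a molecular tool for deliberately regulating gene expression.
It appears that the chromosomal position or structure of the affected locus may be a factor determining the frequency and strength of silencing. Inactivation seems to preferentially affect genes present in multiple copies and is thought to be a consequence of sequence redundancy. Many examples of homology-dependent gene silencing have been reported. Further analysis allows silencing events to be classified according to the relative position of the affected loci (cis, trans, allelic, ectopic), the origin of the affected genes (endogenous or transgenic), and the level of interaction (transcriptional or post-transcriptional). Whereas post-transcriptional silencing appears mainly to involve the formation of aberrant RNA molecules and is accompanied occasionally, but not necessarily, by DNA methylation, silencing that interferes with transcription initiation is closely associated with DNA hypermethylation and possibly with an altered chromatin structure at the silenced locus. However, it is unclear whether these molecular events are prerequisites for gene silencing or consequences of the silenced state.
In transcriptional silencing, the inactive state of the silenced gene is transmitted stably through mitosis and meiosis. As in other organisms, trans-acting modifier loci may be responsible for the stability of the inactive state of silenced genes. Mutations in these loci that give rise to mutant proteins would be expected to reduce gene silencing and reactivate previously silenced loci, either by interfering with maintenance of the silenced state or through a failure to recognize sequence redundancy. Mutations in the DDM1 gene of Arabidopsis thaliana have been reported to release gene silencing, and this gene encodes a SWI2/SNF2-like protein involved in chromatin remodeling. However, mutations in the DDM1 gene cause severe pleiotropic effects. Therefore, to be able to alter this effect by gene technology, it is necessary to identify further, specific modifier loci and to characterize the corresponding wild-type and mutant proteins. The main object of the present invention is to provide DNA comprising an open reading frame encoding such a protein.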
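For an Arabidopsis line carrying a heritably inactivated, methylated hygromycin resistance gene, trans-acting modifier loci according to the invention can be identified by T-DNA insertional mutagenesis as described in Example 1. A mutation in a silencing modifier locus releases silencing of the hygromycin resistance gene and restores hygromycin resistance. Plants homozygous for the silenced resistance gene are transformed with a selectable marker gene other than the hygromycin resistance gene, the marker gene being under the control of the 1′-2′ dual promoter of T-DNA. Transformants are selected and their progeny are screened for hygromycin resistance. The mutant phenotype (hygromycin resistance) is then tested for genetic co-segregation with a particular T-DNA insert. By cloning the tagged gene with conventional recombinant DNA techniques, the mutant and wild-type DNA sequences of the silencing modifier locus and the encoded proteins can be characterized.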
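In the context of the present invention, reference to a gene is to be understood as reference to a DNA coding sequence linked to regulatory sequences that allow the coding sequence to be transcribed into RNA such as mRNA, rRNA, tRNA, snRNA, sense RNA or antisense RNA. Examples of regulatory sequences are promoter sequences, 5′ and 3′ untranslated sequences, introns, and termination sequences.
A promoter is understood to be a DNA sequence that initiates transcription of an associated DNA sequence; it may also include elements that act as regulators of gene expression, such as activators, enhancers, or repressors.
Expression of a gene refers to its transcription into RNA, or its transcription and subsequent translation into protein, within a living cell. In the case of antisense constructs, expression refers only to transcription of the antisense DNA.
The term transformation of cells refers to the introduction of a nucleic acid into a host cell, in particular the stable integration of a DNA molecule into the genome of said cell.
Any part or fragment of a given nucleotide or amino acid sequence is termed a component sequence.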
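The DNA according to the invention comprises an open reading frame encoding a protein characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity with SEQ ID NO: 3. Specifically, the protein encoded by the open reading frame can be described by the formula R1-R2-R3, wherein
-- R1, R2 and R3 constitute component sequences consisting of amino acid residues independently selected from the group of amino acid residues Gly, Ala, Val, Leu, Ile, Phe, Pro, Ser, Thr, Cys, Met, Trp, Tyr, Asn, Gln, Asp, Glu, Lys, Arg and His;
-- R1 and R3 independently consist of 0 to 3000 amino acid residues;
-- R2 consists of at least 150 amino acid residues; and
-- R2 is at least 40% identical to an aligned component sequence of SEQ ID NO: 3.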
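In most cases the total length of the protein is in the range of 1000 to 3000 amino acid residues. In a preferred embodiment of the invention, component sequence R2 consists of at least 200 amino acid residues. Specific examples of component sequence R2 are the component sequences of SEQ ID NO: 3 represented by the following amino acid ranges: 1-416 (corresponding to exon 2); 418-583 (corresponding to exons 3-5); 584-890 (corresponding to exon 6); 892-1472 (corresponding to exons 7-9); 1007-1472 (corresponding to exon 9); 1473-1631 (corresponding to exons 10-12); 1632-1827 (corresponding to exons 13-15); and 1829-2001 (corresponding to exon 16).
In a preferred embodiment of the invention, at least one of the component sequences R1 and R3 comprises one or more additional component sequences that are at least 50 amino acids long and at least 60% identical to an aligned component sequence of SEQ ID NO: 3. Specific examples of such additional component sequences are the component sequences of SEQ ID NO: 3 represented by the following amino acid ranges: 420-525 (corresponding to exons 3 and 4); 444-525 (corresponding to exon 4); 526-583 (corresponding to exon 5); 892-971 (corresponding to exon 7); 892-1006 (corresponding to exons 7 and 8); 1473-1524 (corresponding to exon 10); 1525-1576 (corresponding to exon 11); 1577-1631 (corresponding to exon 12); 1632-1690 (corresponding to exon 13); 1692-1757 (corresponding to exon 14); and 1758-1827 (corresponding to exon 15).
A particularly preferred embodiment of the DNA according to the invention encodes a protein having a component sequence defined by amino acids 478-490, 584-600, 617-630, 654-668, 676-690, 718-734, 776-788, 1222-1233, 1738-1749 or 1761-1770 of SEQ ID NO: 3. Preferably, the encoded protein comprises at least two, three or more different representatives of said component sequences. Specific examples of said embodiment encode a protein characterized by the amino acid sequence of SEQ ID NO: 3, or by an allelic amino acid sequence in which amino acid residue K replaces M at position 705 of SEQ ID NO: 3 or amino acid residue D replaces E at position 1219 of SEQ ID NO: 3.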
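Dynamic programming algorithms produce different kinds of alignments. There are generally two approaches to sequence alignment. The algorithms proposed by Needleman and Wunsch and by Sellers align two sequences over their entire length, giving a global alignment of the sequences. The Smith-Waterman algorithm, on the other hand, produces local alignments. Once a scoring matrix and gap penalties have been chosen, a local alignment aligns the pair of regions within the sequences that are most similar. This allows a database search to focus on the most highly conserved regions of the sequences. It also allows similar regions within the sequences under study to be aligned. To speed up alignment relative to the Smith-Waterman algorithm, both BLAST (Basic Local Alignment Search Tool) and FASTA place additional restrictions on the alignment.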
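The local-alignment idea described above can be illustrated with a minimal Smith-Waterman scoring sketch. The simple match/mismatch scores and the linear gap penalty used here are assumptions chosen for illustration; they are not the scoring matrix or gap penalties used by BLAST, and the function name is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only: a fixed match/mismatch score and a linear
    gap penalty (assumptions, not a substitution matrix such as BLOSUM62).
    """
    prev = [0] * (len(b) + 1)   # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 is what makes the alignment local: a poorly
            # scoring prefix is dropped instead of dragging the score down.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

A global (Needleman-Wunsch) variant would drop the floor at zero and read the score from the last cell of the matrix instead of taking the maximum over all cells.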
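In the context of the present invention, alignments are conveniently performed using BLAST, a set of similarity search programs designed to explore all available sequence databases regardless of whether the query is protein or DNA. Version 2.0 of this search tool (Gapped BLAST) has been made publicly available on the Internet (currently http://www.ncbi.nlm.nih.gov/BLAST/). It uses a heuristic algorithm that seeks local rather than global alignments and is therefore able to detect relationships among sequences that share only isolated regions. The scores assigned in a BLAST search have a well-defined statistical interpretation. Particularly useful within the scope of the present invention are the blastp program, which allows gaps to be introduced in local sequence alignments, and the PSI-BLAST program (both programs compare an amino acid query sequence against a protein sequence database), as well as a variant of blastp that performs a local alignment of just two sequences. Said programs are preferably run with the optional parameters set to their default values.
Sequence alignments using BLAST can also take into account whether the substitution of one amino acid for another is likely to conserve the physical and chemical properties necessary to maintain the structure and function of the protein, or is more likely to disrupt essential structural and functional features of the protein. Such sequence similarity is quantified as the percentage of "positive" amino acids, as compared with the percentage of identical amino acids, and can help to assign a protein to the correct protein family in borderline cases.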
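Sequence alignments using these computer programs reveal the presence of an ATP/GTP-binding motif A (amino acids 460-467 of SEQ ID NO: 3) with the consensus sequence (Ala/Gly)XaaXaaXaaXaaGlyLys(Ser/Thr), where (Ala/Gly) denotes Ala or Gly, Xaa denotes any naturally occurring amino acid, and (Ser/Thr) denotes Ser or Thr. The alignments also reveal a region (amino acids 479-719 of SEQ ID NO: 3) that resembles part of the ATPase/helicase domain of SWI2/SNF2 family proteins involved in chromatin remodeling, but shows no significant overall sequence identity with known proteins.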
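As a small illustration, the consensus (Ala/Gly)XaaXaaXaaXaaGlyLys(Ser/Thr) can be written as a regular expression over one-letter amino acid codes and checked against the residues around positions 460-467 as they appear in the translation given in the sequence listing. The function name and the `offset` parameter are hypothetical helpers introduced for this sketch.

```python
import re

# [AG] = Ala or Gly, four arbitrary residues, then Gly-Lys, then Ser or Thr.
MOTIF_A = re.compile(r"[AG][A-Z]{4}GK[ST]")


def find_motif_a(protein, offset=1):
    """Return (position, matched residues) for each motif A occurrence.

    `offset` is the sequence position of the first residue in `protein`,
    so positions can be reported in full-length numbering.
    """
    return [(m.start() + offset, m.group()) for m in MOTIF_A.finditer(protein.upper())]


if __name__ == "__main__":
    # Residues 457-471 of SEQ ID NO: 3, read from the sequence listing.
    fragment = "EENGQIQHGKSSDPK"
    print(find_motif_a(fragment, offset=457))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for locating a single motif in a short fragment.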
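Specific examples of DNA according to the invention are described in SEQ ID NO: 1 and SEQ ID NO: 2, which encode the Arabidopsis protein described in SEQ ID NO: 3. When aligned, stretches of SEQ ID NO: 3 that are 50 to 500 amino acids long show 20-50% sequence identity with stretches of known protein sequences. A global alignment of SEQ ID NO: 3, however, yields less than 30% sequence identity. The invention therefore defines a new protein family whose members are characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity with an aligned component sequence of SEQ ID NO: 3. The amino acid sequence identity is preferably higher than 50% or even higher than 55%.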
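The family criterion (a component sequence of at least 150 residues with 40% or more identity to the aligned stretch of SEQ ID NO: 3) can be sketched as a check on a pairwise alignment. The sketch assumes the two aligned strings come from an external alignment program, with `-` marking gaps, and it assumes identity is counted as identical columns divided by alignment length, as BLAST reports it. The text itself does not fix that convention, and both function names are hypothetical.

```python
def percent_identity(aligned_query, aligned_ref):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        return 0.0
    identical = sum(q == r and q != "-" for q, r in zip(aligned_query, aligned_ref))
    return 100.0 * identical / len(aligned_query)


def meets_family_criterion(aligned_query, aligned_ref,
                           min_residues=150, min_identity=40.0):
    """True if the query stretch is long enough and sufficiently identical."""
    residues = sum(c != "-" for c in aligned_query)   # gaps do not count as residues
    return (residues >= min_residues
            and percent_identity(aligned_query, aligned_ref) >= min_identity)
```

Raising `min_identity` to 50.0 or 55.0 gives the preferred thresholds mentioned in the text.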
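DNA encoding proteins belonging to the new protein family of the invention can be isolated from monocotyledonous and dicotyledonous plants. Preferred sources are cereals, sugar beet, sunflower, winter oilseed rape, soybean, cotton, wheat, rice, potato, broccoli, cauliflower, cabbage, cucumber, sweet corn, daikon, broad bean, lettuce, melon, pepper, squash, tomato, or watermelon. However, they can also be isolated from mammalian sources such as mouse or human tissue. Those skilled in the art know that the following general method can be adapted to the particular task. A single-stranded fragment of SEQ ID NO: 1 or SEQ ID NO: 2 consisting of at least 15, preferably 20 to 30, or even more than 100 consecutive nucleotides can be used as a probe to screen a DNA library for clones that hybridize to said fragment. Factors to be observed for hybridization are described in Sambrook et al. (Molecular cloning: A laboratory manual, Cold Spring Harbor Laboratory Press, chapters 9.47-9.57 and 11.45-11.49, 1989). Hybridizing clones are sequenced, and the DNA of clones comprising a complete coding region is purified, the coding region encoding a protein characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity with SEQ ID NO: 3. Said DNA can then be processed further by a number of conventional recombinant DNA techniques such as restriction enzyme digestion, ligation, or polymerase chain reaction analysis.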
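The disclosure of SEQ ID NO: 1 and SEQ ID NO: 2 enables those skilled in the art to design oligonucleotides for polymerase chain reactions that attempt to amplify DNA fragments from templates comprising a nucleotide sequence characterized by any contiguous sequence of 15, preferably 20 to 30, or more base pairs of SEQ ID NO: 1 or SEQ ID NO: 2. Said oligonucleotides comprise a nucleotide sequence representing 15, and preferably 20 to 30 or more, base pairs of SEQ ID NO: 1 or SEQ ID NO: 2. Polymerase chain reactions performed with at least one such oligonucleotide, and their amplification products, constitute a further embodiment of the invention.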
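A minimal sketch of how candidate oligonucleotides of a given length could be enumerated from a disclosed sequence is shown below. It only lists contiguous windows and their reverse complements; melting temperature, secondary structure and specificity, which matter for real primer design, are deliberately not modeled, and the function names are hypothetical. The template used is the first 40 nucleotides of SEQ ID NO: 2, whose first 25 nucleotides coincide with primer RT4-1 (SEQ ID NO: 24).

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def contiguous_oligos(template, length=20):
    """List every contiguous window of `length` nucleotides in `template`."""
    if length < 15:
        raise ValueError("the text calls for at least 15 contiguous nucleotides")
    return [template[i:i + length] for i in range(len(template) - length + 1)]


if __name__ == "__main__":
    # First 40 nucleotides of SEQ ID NO: 2 as given in the sequence listing.
    template = "cacaagcatgagtttttccttccggtaatcgtaaaatcaa".upper()
    forward = contiguous_oligos(template, 25)
    print(forward[0])                       # same sequence as primer RT4-1
    print(reverse_complement(forward[-1]))  # a candidate reverse primer
```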
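Examples
Example 1: T-DNA insertion
The transgenic line A of Arabidopsis thaliana ecotype Zürich, which carries a transcriptionally silenced locus containing multiple copies of a chimeric hygromycin phosphotransferase gene (hpt), is described in Mittelsten Scheid et al., Mol Gen Genet 228: 104-112, 1991 and Mittelsten Scheid et al., Proc Natl Acad Sci USA 93: 7114-7119, 1996. A homozygous diploid genotype of said line was subjected to Agrobacterium-mediated gene transfer by in planta vacuum infiltration (Bechtold et al., C R Acad Sci Paris Life Science 316: 1194-1199, 1993), yielding more than 4000 independent T-DNA transformants. The binary vector (p1′barbi) carrying a T-DNA that consists of the bar gene coding region transcriptionally fused to the 1′ promoter, the Agrobacterium strain (C58C1RifR), and the transformation protocol are described by Mengiste et al. (Plant J 12: 945-948, 1997). Transformants (T1 plants) were selected by repeatedly spraying germinating seedlings with Basta solution (150 mg/l) and were grown to maturity.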
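Example 2: Mutant selection
Selfed seed (T2 families) was harvested from each transformant. Before screening for revertants of the silencing phenotype, the seeds were dried at room temperature for 1 week and cold-treated at 4°C for at least 1 week. Pooled aliquots of approximately 1000 seeds (consisting of 50 seeds from each of 20 T2 families) were surface-sterilized twice for 7 minutes (with 5% sodium hypochlorite containing 0.1% Tween 80) and washed with sterile double-distilled water. For selection, each aliquot was placed in a 14-cm Petri dish containing 75 ml of germination medium (according to Masson et al., Plant J 2: 829-933, 1992) solidified with 0.8% agar and containing 10 mg/l hygromycin B (Calbiochem). To ensure even distribution at sowing, the seeds were mixed with 30 ml of the same medium containing 0.4% agar. As a positive control, two seeds from a hygromycin-resistant line were sown at a marked position on each dish. Plates were cold-treated at 4°C for 2 days and then subjected to alternating cycles of 16 hours of light at 21°C and 8 hours of darkness at 16°C. Hygromycin resistance was assessed daily after sowing, over a period of 8 to 15 days.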
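Example 3: Molecular and genetic analysis of the mutant
After 11 hygromycin-resistant seedlings had been identified in one of the pooled samples, the families that made up that pool were re-screened individually. One family contained approximately 25% hygromycin-resistant seedlings. Six resistant seedlings of this family were transferred to larger containers with hygromycin-free germination medium. After rosette formation and root development, the plants were transferred to soil for further growth and seed set. Before potting, tissue explants were taken from each plant and used to produce callus cultures on RCA medium (Table 1) with or without 10 mg/l hygromycin B. The callus cultures served as a source of material for DNA and RNA analysis and were used to further confirm hygromycin resistance in this tissue.
Genomic DNA was isolated by the CTAB-based method described by Mittelsten Scheid et al. (Mol Gen Genet 224: 325-330, 1994) and incubated with the restriction enzymes BamHI, HpaII, MspI, DraI, EcoRV, RcaI or HindIII. Total RNA was obtained with the RNeasy kit (Qiagen) according to the manufacturer's instructions. Southern and Northern hybridization analyses were performed under the conditions described by Church and Gilbert (Proc Natl Acad Sci USA 81: 1991-1995, 1984), using DNA fragments labeled with 32P by random-primed labeling. The coding region of the hpt gene, or DNA consisting of the P35S promoter, the hpt coding region and the terminator region, or the coding region of the bar gene and the 1′ promoter, were used as probes.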
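Northern hybridization analysis of four hygromycin-resistant siblings showed restored transcription of the hpt gene. Southern hybridization analysis of said siblings indicated no detectable rearrangement within the complex hpt insert. Southern hybridization analysis with the methylation-sensitive restriction enzymes HpaII and MspI, together with genomic sequencing of the promoter region after bisulfite treatment, established that the hpt transgene complex in the mutant remains hypermethylated, as in the original line A. In contrast to what is observed in som mutants, the mutation also has no effect on the methylation of repetitive genomic DNA.
Hygromycin-resistant plants and unselected siblings from the same family were grown to produce seed; the next generation was tested for Basta resistance, and the number and size of the T-DNA inserts were recorded by Southern analysis. The results showed that the original T-DNA transformant must have contained two T-DNA inserts that segregated independently among the siblings. One insert co-segregated with the hygromycin-resistant mutant phenotype. Plants homozygous for this insert and lacking the other T-DNA insert were used to clone the corresponding T-DNA insertion site.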
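Histochemical GUS staining of hybrids between plants with the mutant phenotype and the transgenic line GUS-TS of Arabidopsis thaliana ecotype Columbia (obtainable from Dr. H. Vaucheret, INRA, Versailles Cedex, France), which carries a transcriptionally silenced locus containing multiple copies of a chimeric β-glucuronidase (gus) gene, revealed reactivation of the silenced GUS gene in F2 progeny homozygous for the mom allele.
Inbreeding of plants with the mom1 mutant phenotype does not lead to any morphological abnormalities, even in the ninth inbred generation. This is in contrast to som mutants.
Backcrossing the mom1 mutant to line A (see Example 1) leads to immediate re-silencing of the reactivated hpt gene, owing to the introduction of the wild-type MOM allele in the F1 hybrid. This, too, is in contrast to som mutants.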
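Table 1: Composition of RCA medium

RCA medium
  MS macro 10×                      100 ml
  B5 micro 1000×                    1 ml
  Ferric citrate                    5 ml
  NT vitamins 100×                  10 ml
  Sucrose                           10 g
  MES                               5 ml
  Agar                              10 g
  NAA                               0.1 mg
  BAP                               1 mg
  pH 5.8 (KOH)
  made up to 1 liter

MS macro 10×
  Potassium nitrate                 19 g
  Ammonium nitrate                  16.5 g
  Calcium chloride (×2H2O)          4.4 g
  Magnesium sulfate (×7H2O)         3.7 g
  Potassium dihydrogen phosphate    1.7 g
  made up to 1 liter

B5 micro 1000×
  Manganese sulfate (×H2O)          1000 mg
  Boric acid                        300 mg
  Zinc sulfate (×7H2O)              200 mg
  Potassium iodide                  75 mg
  Sodium molybdate (×2H2O)          25 mg
  Copper sulfate (×5H2O)            2.5 mg
  Cobalt chloride (×6H2O)           2.5 mg
  made up to 100 ml

Ferric citrate
  Ferric ammonium citrate           10 g
  made up to 1 liter

NT vitamins 100×
  Inositol                          1000 mg
  Thiamine hydrochloride            10 mg
  made up to 1 liter

MES
  MES                               14 g
  pH 6 (NaOH)
  made up to 100 ml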
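Example 4: Cloning of the "silencing gene"
Genomic DNA was isolated from plants containing only the T-DNA that co-segregates with the hygromycin-resistant mutant phenotype. This DNA was subjected to TAIL (thermal asymmetric interlaced) PCR according to Liu et al. (Plant J 8: 457-463, 1995), using three specific nested primers located near the T-DNA right border and directed outward (5′-CAT CTA CGG CAA TGT ACC AGC-3′ (SEQ ID NO: 4), 5′-GAT GGG AAT TGG CTG AGT GGC-3′ (SEQ ID NO: 5), 5′-CAG TTC CAA ACG TAA AAC GGC-3′ (SEQ ID NO: 6)) and one of several degenerate primers that may anneal to the flanking plant DNA. In practice, 2 of the following 7 degenerate primers led to the amplification of specific fragments.
AD1 5′-NTC GAS TWT SGW GTT-3′ (Liu et al., supra; SEQ ID NO: 7)
AD2 5′-NGT CGA SWG ANA WGA A-3′ (Liu et al., supra; SEQ ID NO: 8)
AD3 5′-WGT GNA GWA NCA NAG A-3′ (Liu et al., supra; SEQ ID NO: 9)
AD4 5′-WGG WAN CWG AWA NGC A-3′ (SEQ ID NO: 10)
AD5 5′-WCG WWG AWC ANG NCG A-3′ (SEQ ID NO: 11)
AD6 5′-WGC NAG TNA GWA NAA G-3′ (SEQ ID NO: 12)
AD7 5′-AWG CAN GNC WGA NAT A-3′ (SEQ ID NO: 13)
The larger fragment obtained with AD7 was cloned and sequenced. It contained 50 bp of T-DNA and 275 bp of flanking plant DNA. Southern hybridization analysis showed that this PCR fragment contains plant DNA flanking the T-DNA. The PCR fragment was used to screen a genomic library (Stratagene) of wild-type Arabidopsis thaliana ecotype Columbia. Three genomic clones hybridizing to the PCR fragment were identified. These genomic clones were further mapped with restriction enzymes, hybridized to the PCR fragment and compared with one another. In one of the genomic clones obtained (p4A-11), the sequence flanking the T-DNA of the insertion mutant lies approximately in the middle of the genomic sequence. An EcoRI-SalI fragment of approximately 800 bp from p4A-11 was used to obtain the overlapping genomic clone p5-6, and an EcoRI fragment of approximately 700 bp from p5-6 was used to obtain genomic clone p30-1, which overlaps p5-6. A HindIII fragment of approximately 700 bp from p30-1 was used to obtain genomic clone p33-19, which overlaps p30-1. Said clones were sequenced in order to design primers for RT-PCR. The EcoRI fragment of approximately 700 bp from p5-6 was further used to screen a cDNA library of wild-type Arabidopsis thaliana ecotype Zürich prepared according to Elledge et al. (Proc Natl Acad Sci USA 88: 1731-1735, 1991). Nine cDNA clones were obtained, and the longest clone, p17-8, 2.6 kb in length, was sequenced.
Example 5: Sequence analysis and alignment
Given the large size of the Arabidopsis silencing gene cloned above, it cannot be completely excluded that, owing to mutations arising during the cloning steps or to uncertainties in the sequencing reactions, the true nucleotide and amino acid sequences of the gene and protein deviate at a few positions from the sequences given in SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3. In addition, DNA sequences from different ecotypes may show allelic differences. Thus, SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 represent the corresponding gene and protein of Arabidopsis thaliana ecotype Zürich, whereas the genomic sequence obtained from Arabidopsis thaliana ecotype Columbia shows two mismatches, at nucleotide positions 4338 (A replacing T) and 6721 (T replacing G) of SEQ ID NO: 1, which result in amino acid residue K replacing M at position 705 of SEQ ID NO: 3 and amino acid residue D replacing E at position 1219 of SEQ ID NO: 3.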
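The two ecotype differences can be checked against the standard genetic code with a short sketch. The codons used here are read from SEQ ID NO: 1: position 4338 is the middle base of an ATG codon, and position 6721 is taken to be the third base of a GAG codon. The reading frame at the second site is an assumption inferred from the stated E to D change, since that part of the translation is not shown in this section. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def substitution_effect(codon, offset, new_base):
    """Return (original amino acid, new amino acid) for a single-base change.

    `offset` is the 0-based position of the changed base within the codon.
    """
    codon = codon.upper()
    mutated = codon[:offset] + new_base.upper() + codon[offset + 1:]
    return CODON_TABLE[codon], CODON_TABLE[mutated]


if __name__ == "__main__":
    print(substitution_effect("ATG", 1, "A"))  # nucleotide 4338, T to A
    print(substitution_effect("GAG", 2, "T"))  # nucleotide 6721, G to T (frame assumed)
```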
Sequence analysis of the 2.6 kb cDNA clone, carried out sequentially from both ends, showed that it contains one large ORF and a 3′ untranslated sequence.
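Analysis of the genomic clones showed that clones p4A-11 and p5-6 contain sequences homologous to the cDNA sequence together with 7 intron sequences. Comparison of the genomic sequence with the DNA sequence flanking the T-DNA insert showed that insertion of the T-DNA caused a deletion of approximately 2 kb of genomic DNA. The 5′ end of the deletion lies in an intron (intron 12), and the 3′ end of the deletion lies downstream of the 3′ end of the cDNA. The 5′ end of this cDNA clone terminates in the middle of genomic clone p5-6. Three independent nested RT-PCRs were performed to obtain additional cDNA sequence further upstream. The primer sequences used for these RT-PCRs are as follows:
RT1-1 5′-CTGTACATACTGAGTACAATCGGA-3′ (SEQ ID NO: 14)
RT1-2 5′-GCTTCAATTCCTGCCTCAGTTGAAC-3′ (SEQ ID NO: 15)
RT1-3 5′-CTCTACGTGCTTAACATCATGCGA-3′ (SEQ ID NO: 16)
RT1-4 5′-CCAGCTTCTGCTACTAGAAAGTCAG-3′ (SEQ ID NO: 17)
RT2/3-1 5′-CTGGAGTTGCATGAAATCCTGGATG-3′ (SEQ ID NO: 18)
RT2/3-2 5′-GCTCTTTGTAAGCTGTTCACGAGAC-3′ (SEQ ID NO: 19)
RT2-3 5′-TCGCATGATGTTAAGCACGTAGAG-3′ (SEQ ID NO: 20)
RT2-4 5′-GAGTACTGGTCCGTGAACAGGTAAT-3′ (SEQ ID NO: 21)
RT3-3 5′-ATGCTTGCACAAGCATGGTCGGAAA-3′ (SEQ ID NO: 22)
RT3-4 5′-TGCAACATCGTGCATTTGCTCCAGA-3′ (SEQ ID NO: 23)
RT4-1 5′-CACAAGCATGAGTTTTTCCTTCCGG-3′ (SEQ ID NO: 24)
RT4-2 5′-CTGACTTTCTAGTAGCAGAAGCTGG-3′ (SEQ ID NO: 25)
Several partial sequences of these genomic clones were found to be deposited in Arabidopsis databases (accession numbers B67281, B62563, B20434, B20425, B21274, B08967, B11993, B20116, B12496 and B10852 are BAC end sequences; Z18494 and AA597930 are partial cDNA sequences; April 13, 1999). Comparison of the encoded protein sequence with the Swiss Protein database revealed partial similarity to the ATPase/helicases of the SWI2/SNF2 family (amino acids 479-719 of SEQ ID NO: 3). The encoded protein consists of 2001 amino acids and has a calculated molecular weight of 219 kD and an isoelectric point of 5.1. An ATP/GTP-binding motif (amino acids 460-467 of SEQ ID NO: 3) and three nuclear localization motifs (amino acids 362-367, 832-838 and 858-862 of SEQ ID NO: 3) were found in the encoded protein. Subcellular immunodetection of HA-tagged MOM protein confirmed its nuclear localization. Similarity to the actin-binding domain of chicken tensin (amino acids 1899-1941 of SEQ ID NO: 3) and a predicted transmembrane domain (amino acids 995-1015 of SEQ ID NO: 3) were also detected. In addition, the encoded protein contains three types of repeat regions, or internal repeats, essentially defined by amino acids 177-350, 1462-1672 and 1848-1894 of SEQ ID NO: 3.
Example 6: Homologous genes in other species
A putative Arabidopsis thaliana proline/hydroxyproline-rich glycoprotein showing partial similarity to the MOM protein is disclosed under GenBank accession number AAD29829. Depending on the region, the similarity is 34-47%, and it is seen only in the second half of the MOM protein (i.e. amino acids 1368-1944).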
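The MOM cDNA clone was used in Southern hybridization analysis to probe genomic DNA from turnip, tomato, tobacco, maize, mouse, Drosophila and human for the presence of homologous genes. Hybridization under low-stringency conditions was found in all cases. Cross-hybridizing clones from libraries can be identified and sequenced.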
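A genomic library of kale (Brassica oleracea var. acephala) (obtained from Dr. Mark Cock, INRA, CNRS, Lyon, France) was screened with MOM cDNA under stringent conditions. Two positive clones were obtained, subcloned and partially sequenced. Partial sequences of clone 1 showed similarity (80-86% at the DNA level, 62-80% at the amino acid level) to different regions of the MOM gene encoding the N-terminal, ATPase and C-terminal parts of the MOM protein. The three putative nuclear localization sequences of MOM are fully conserved in clone 1. Partial sequences of clone 2 also showed regions of similarity to the MOM gene (64-76% at the DNA level, 55-64% at the amino acid level), encoding the ATPase, the putative transmembrane region, and the C-terminal part of the MOM protein. The sequences of clones 1 and 2 are not identical, suggesting that at least two homologous genes exist in Brassica oleracea. Examples of partial sequences obtained from clones 1 and 2 are given in SEQ ID NOs: 26-33.
A genomic library of Brassica rapa (obtained from Dr. Kinya Toriyama, Tohoku University, Sendai, Japan) was also screened with MOM cDNA under stringent conditions. Positive signals hybridizing to the 5′ part and the 3′ part of the MOM cDNA were obtained.
In addition, a genomic library of petunia (Petunia hybrida) (obtained from Dr. Jan Kooter, Vrije Universiteit, Amsterdam, The Netherlands) was screened with MOM cDNA under less stringent conditions. Positive signals hybridizing to the 5′ part and the 3′ part of the MOM cDNA were obtained.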
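Example 7: Manipulation of marker gene expression by antisense constructs
The 2.6 kb cDNA fragment and a 1.8 kb RT-PCR fragment, amplified by nested RT-PCR using primers RT1-1 and RT1-2 for the first PCR and primers RT1-3 and RT1-4 for the second PCR, were each cloned in reverse orientation into the multiple cloning site of the binary vector pbarbi53 in order to produce antisense RNA. pbarbi53 is a modified version of p1′barbi that carries, at the HindIII site of p1′barbi, an expression cassette consisting of the cauliflower mosaic virus 35S promoter, a multiple cloning site with Xho I, SnaBI, Hpa I and Cla I restriction sites, and the cauliflower mosaic virus 35S terminator. The resulting recombinant plasmids were introduced into Agrobacterium as described in Example 1. The transgenic line GUS-TS of Arabidopsis thaliana ecotype Columbia (obtained from Dr. H. Vaucheret, INRA, Versailles Cedex, France), which carries a transcriptionally silenced locus containing multiple copies of a chimeric β-glucuronidase (gus) gene, was transformed with the recombinant plasmids as described in Example 1, and transformants were selected by the method described by Mengiste et al. (Plant J 12: 945-948, 1997). pbarbi53 vector DNA was used in control transformations. Transformants were examined for reactivation of the gus gene by histochemical staining. One cotyledon was vacuum-infiltrated for 10 minutes in gus staining solution (100 mM sodium phosphate buffer (pH 7.0), 0.05% 5-bromo-4-chloro-3-indolyl-β-D-glucuronide, 0.1% sodium azide) and then incubated at 37°C overnight. Strong gus activity was observed in plants transformed with the recombinant plasmid carrying the 2.6 kb cDNA, whereas plants transformed with the recombinant plasmid carrying the 1.8 kb RT-PCR fragment or with pbarbi53 showed no gus activity above background. Thus, expression of antisense RNA of the 2.6 kb cDNA mimics the mutant phenotype and confirms that the sequences shown in SEQ ID NO: 1, SEQ ID NO: 2 and SEQ ID NO: 3 represent genetic information for a component of the transcriptional gene silencing system.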
序列表<110>Novartis AGNovartis Research Foundation<120>參與外遺傳基因沉默的基因<130>S-31005A<140><141><150>GB 9914623.5<151>1999-06-23<160>33<170>PatentIn Ver.2.1<210>1<211>10329<212>DNA<213>擬南芥<220><221>內(nèi)含子<222>(1009)..(1295)<220><221>內(nèi)含子<222>(2551)..(2673)<220><221>內(nèi)含子<222>(2753)..(2867)<220><221>內(nèi)含子<222>(3114)..(3506)<220><221>內(nèi)含子<222>(3681)..(3973)<220><221>內(nèi)含子<222>(4896)..(4975)<220><221>內(nèi)含子<222>(5218)..(5777)<220><221>內(nèi)含子<222>(5883)..(6082)<220><221>內(nèi)含子<222>(7481)..(7615)<220><221>內(nèi)含子<222>(7772)..(7914)<220><221>內(nèi)含子<222>(8071)..(8153)<220><221>內(nèi)含子<222>(8319)..(8451)<220><221>內(nèi)含子<222>(8630)..(8718)<220><221>內(nèi)含子<222>(8919)..(9000)<220><221>內(nèi)含子<222>(9212)..(9284)<400>1aatatttaag tttggtttat attctttcta gtaatctttg aaatattgta agagataatg 60cttctaataa ataacattgg atttattgga attaatgtat tgaaaaaact atgcaaatac 120tacagtgtat tttggaacga ccaaaatgat atatgtaaac tttcgttcta gtcttctaca 180tagtgtaata ggatagcgga caaggttgat cgactctaaa cattatgggt acgtaattcc 240gcagtggtta cagtctactg tcgaggccaa actggtaatt aaacgtttga agtttagaga 300aatattttga tgatgagtac cacaatcaaa gatgataggt gttaatcact gtaaaaatgt 360tgattgaata ctacgaatgc agaacatata catattttta atctctttgg aatttttgtt 420tttgttttta tcatttttga atacacgaag agctcagtta tatttcatat tgtatatgaa 480tttgttctat ttaatcttca attctagcaa catactctta tgctaattcg tttcatattt 540tagtatagta taaaaattac aaatttcaaa acaaactata agtaatatac taacatagtc 600ggtgtaacat ttcgttaatt tcacataaca tatgttaatt acatatgtac actatttttg 660aagtatttta taacttaaaa tatataaatt taaatctaag aaatcacaag catgagtttt 720tccttccggt aatcgtaaaa tcaaaaatcg ctcgctcgag aaacgccggt gctagaagag 780gaaagtaccg tacataatcc tgcgaaccca attctcgtct tcttcaaact cagttttccg 840aaaccccaaa caccgcgagg attgcatggc ctgaagaacc acttaatcga gaattgtgct 900ggaattctca aattttccct cgcgtttttc tttcacactc tcggaatcgg aaatttccac 960caagctccgt caagcgatag attctgacaa ttacacactt tcgcgcaggt atgcttcctt 1020ccctgtttta ggttggtgtt aatctatcgg tgaatcgaag gttttgggcc tcgggctttg 1080cgttttaggt 
ttttcagaga atcttatcta cttggggatg gatcttaggc gtttgttaga 1140tgtaactcat tagttttgca tataggaatt ttgatttgaa agttaggtcg ccggatttgt 1200agacattttg tttgatggtc ttcttcggtg ctcacattct ttgtttttaa gtgcttgatt 1260tggttgctaa ggtcctttcc gttgcgtgct ctcagtgaat atgaagaaag atgaaaagat 1320tggtttgacg gggagaacca tttacaccag atccctagca gcttcaattc ctgcctcagt 1380tgaacaagaa acccctggtt tgaggaggtc aagccggggg acaccatcta cgaaggtaat 1440aactccagct tctgctacta gaaagtcaga gagactggct ccctcacctg cttcagtttc 1500aaaaaagtcc ggtggaatcg tcaagaattc cacaccaagt tctttgcgaa ggtccaatag 1560ggggaagact gaagtatcct tgcagagttc caaaggatca gataattcta tcaggaaagg 1620agatacttca ccggatattg agcagagaaa ggatagtgtt gaagagtcga cagataagat 1680caagcctata atgtcagccc gaagttacag ggcattgttt agagggaagc tcaaggaatc 1740tgaggcatta gttgatgctt ccccaaatga agaggaacta gtagttgttg gttgttctcg 1800ccgcatacct gcaggcaatg atgatgttca aggtaaaaca gattgtccac cacctgcaga 1860tgcaggatca aaaaggctgc cagttgacga aactagtttg gacaagggca ctgattttcc 1920tttgaaatca gttacggaga ccgagaagat agtgcttgat gcatccccca tagttgaaac 1980tggggatgac agtgttatag gttcaccatc tgagaattta gagacacaaa agcttcaaga 2040tggtaagaca gattgttcac cacctgcaaa tgcagaatcg aaaacgctgc cagttggtga 2100aactagttta gaaaaagaat atccacaaaa gtttcaagat gataacacag attgtctacc 2160acctgcaaat gcagaatcaa aaaggctgcc agttggcgaa actagtttag aaaaggacac 2220tgattttcct ttgaaatcaa ctacggagac tggaaagatg gttctttatg catcccccat 2280agttgaaact agggatgaca gcgttatatg ttcaccatct acaaatttag aaacccaaaa 2340gcttcttgtc agtaaaactg gcttagaaac cgacatagtt ttgcctttga aaagaaaaag 2400agacactgca gaaattgagc tggatgcatg tgctacagtt gcaaatggag atgatcacgt 2460tatgagttct gatggggtca ttccatctcc atctgggtgc aaaaatgata atcgacctga 2520aatgtgcaac acgtgtaaaa aacggcaaaa gtaagagttt ttttagtgtt gtctgtctat 2580tgaaacgatc tgccaatgtt gaatgttggg cagatgggtt tgattcttag gatatatgtt 2640ctgtattgta atgagttgtt caaaattttg aagggtcaac ggtgattgtc aaaataggag 2700tgtttgctcc tgcattgtcc agccagttga agaatctgat aacgtgacac aggttggttt 2760ctaattactt tcggagaccc gttaatcagt ggactcttaa 
atagttagat actagattta 2820cttatccttt tacttgtaat ctgcaattct attttgcatt tgattaggat atgaaagaaa 2880ctggaccagt tacgagcaga gaatatgagg agaacgggca aatacaacat ggtaaatcaa 2940gtgatcccaa attctattct tcggtgtacc cagagtattg ggttcctgtg cagctatcag 3000atgtacagct ggagcaatac tgtcagactc tcttctccaa atccttatct ctttcttcac 3060tttcgaagat tgatcttgga gctctagaag aaactctcaa ttctgtaaga aaagtaagtt 3120acttgatttt aaaaacactt attcttcaat gcacttgtga gttaagtacc cagttattac 3180tggtgataag ataaagaaag caatagaaaa attgataagg tgttcaccgc attgcagcca 3240aaaaaacaat tctgtgttcc atgctttcaa gaggttgtca cataggtgtt atgcctttct 3300gtttgatgtt tggtagagca aaggttttgg gtctatttgt tttatgcttt tttgaaacac 3360atagaacctg gcaaacttga cagttttggg gttgcttaga tatacgacta ttgtcggtca 3420gcatcacatt ttctcaaggc ctctttctgc atgttaatgt gtgaatatat taaaatcttc 3480tttatgtgtt tgcaacttgt tgacagacct gtgaccatcc atacgttatg gatgcatctt 3540tgaaacaact gctcaccaag aatctggagt tgcatgaaat cctggatgta gaaattaaag 3600cgagcgggaa acttcacctc cttgataaaa tgcttactca tataaaaaag aatggtttaa 3660aagcagttgt cttctaccag gtgcattttc tattacttgc gaatgtgaat agctctatgt 3720ttgtcatgaa tacgtcactt tgtgcattct caatatatgt gcattttctt tttgacaatg 3780gaattctgtc ttgtattgaa atttgagtgg gatgaaagta tgctttttat cgtgcaatta 3840tgaagtgtaa gttagccttc agcagtcagc tagcattatg agatatgctg aactaaaatg 3900tttcttttct cttctttctt tttcgttata tgtgcctcat gtatgtttga attacagttt 3960ttattttcag caggcaacac aaacccctga agggcttctg cttggtaata ttctcgaaga 4020ttttgtgggt caaagatttg gtccaaaatc ttatgagcat gggatatatt cctcaaagaa 4080gaactccgct ataaacaatt tcaacaagga gagtcaatgc tgtgttctgc tgttggaaac 4140acgtgcctgc agtcaaacca ttaaactctt gcgagctgat gcgtttattc tttttggaag 4200cagcttgaat ccatcgcatg atgttaagca cgtagagaag ataaaaatcg agtcatgttc 4260tgaaagaact aagatattcc gattgtactc agtatgtaca gttgaagaaa aagccctgat 4320tctggctagg caaaatatgc ggcaaaataa agctgtagag aacctaaacc gctctctcac 4380gcacgcactg ctcatgtggg gggcgtcata cttatttgat aaactggatc attttcacag 4440cagtgaaact ccagattcag gagtttcatt tgaacaatct attatggacg gcgtgattca 4500tgaattctcg 
tccatacttt cttccaaagg tggagaagaa aatgaagtca agctgtgtct 4560acttttggag gccaagcatg ctcagggaac ttacagcagt gattctactc tatttggtga 4620agaccatatt aagttgtcag atgaagagag tccaaatata ttttggtcaa agctgttggg 4680gggaaaaaat cctatgtgga aatacccttc agatactccc caaaggaatc gaaaacgagt 4740tcagtatttt gagggttctg aagcgagtcc caaaactggc gatggtggaa atgcaaagaa 4800gcgaaagaag gcttctgatg atgtcactga tccccgggtc actgatccgc cagtagatga 4860tgatgaaaga aaggcctctg ggaaggatca catgggtaaa atagtttaat ttctgctccg 4920atacctctag tgttcattga ttatgcaact actttgctga ctatctttcc tacaggggct 4980ttggagtcac caaaagtcat aacactccag tcatcatgta aatcttctgg tacagatggt 5040acattggatg gaaatgatgc ttttggcttg tattctatgg gcagccatat ctctggaatc 5100ccagaggata tgttagctag tcaagattgg gggaaaatac cggatgaatc acagaggagg 5160ctccacactg ttttaaagcc gaagatggca aaactttgcc aagttttgca tctttcagta 5220agtggccttt ttcacctcca caacttattt tagccttgca tatgcttata tatagctgat 5280tgcaactgta gttgttacct gatttcctgt tacagccaaa tgtgagagtt ttattcttca 5340actatatcca tccgtttaag catattttat ttcttatatc tggcttcgtt accaatgcac 5400tgttaaaatg agcaactgct gcacaaaaca gtaggtagtt atgtgcctca tgtcattcat 5460tgtttattga agcaaagaaa tttctgtcta ctttacatga tccatctgtg ggagtatata 5520actatatata accttaggcc tttgtacctg gctgatcaaa gacatgtcaa aagtttatct 5580gttcgctgtt ggtatagaaa ctaatacagt gtctgatgct attttaaggt agtcttatgt 5640cttcacatat tggctaatag atgtttccgc tgtcgtgtcc atatacttct gtgattatca 5700cggtgctccg tctatcaaaa ttgtactaaa aggtattttg caatgtgtga ttggttaaca 5760gattattttg ttttcaggat gcttgcacaa gcatggtcgg aaattttctc gaatatgtta 5820ttgaaaatca ccgaatctac gaagagccag ccactacttt tcaggcattc cagatagccc 5880tggtatgaca gcatttactt tgataattta tgcattgttt ccttcatcat ctgcctttgt 5940ttagaatgtc ctcagaaggc agcactcctt tagttttaac tttccaatca taggattcaa 6000atatccatta actggccttt gatcgctgca taatatatga atagttgaca tactgaatac 6060gttgttaata atgcattttc agagttggat tgcagccttg ttggtaaagc aaattcttag 6120ccacaaagaa tctctggtcc gtgcaaattc tgaattagct ttcaaatgct ctagagtaga 6180ggtggattat atttattcga tattgtcctg catgaagagt 
ctgttcctgg agcatacaca 6240aggtttgcag ttcgattgct ttggtactaa ttctaaacag tcagtggtta gcacaaaact 6300agtaaatgaa agtctctcag gggctacagt gcgtgacgaa aagattaata cgaagtcgat 6360gcgaaatagc tcagaggatg aagagtgcat gactgagaag agatgtagcc attatagcac 6420agcaacaaga gatatcgaaa agactattag tggcataaaa aagaaataca agaagcaagt 6480gcaaaagctt gtacaagagc atgaggaaaa gaaaatggag ctgttaaata tgtatgcaga 6540caagaagcag aaacttgaaa ctagtaaaag tgtggaagca gcagtaattc gtattacctg 6600ttcacggacc agtactcaag tgggtgatct caaactgctg gatcataatt atgaaagaaa 6660gtttgatgaa atcaaaagtg agaaaaatga atgcctcaaa agtctggagc aaatgcacga 6720ggttgcaaag aagaagttgg ctgaggatga agcctgttgg attaatcgga taaagagctg 6780ggcagctaaa ttaaaagttt gtgttcccat tcaaagtggc aataacaagc attttagtgg 6840ttcatcaaac atttcccaaa atgctcctga tgtacaaatt tgcaataatg ctaacgttga 6900agctacttac gctgatacga attgcatggc ttccaaggtt aatcaagtgc cagaagcaga 6960aaacacatta ggaaccatgt cgggtggcag cactcaacaa gttcatgaaa tggtggatgt 7020aagaaatgac gagacaatgg atgtctcagc tttgtctcgt gaacagctta caaagagcca 7080gtccaatgag cacgcttcta tcactgtgcc tgagattttg attcctgctg actgtcaaga 7140ggaatttgcg gccttgaacg tgcatttgtc agaagaccag aattgtgaca gaataacatc 7200tgcggcatca gatgaagatg tttcatcaag ggtgccagag gtatcccagt cactcgaaaa 7260tctttctgcc tcccccgagt tttctctaaa tagagaggag gctttggtta caacagaaaa 7320tagaagaaca agtcatgtgg gttttgatac tgataacatt ttggaccagc agaatagaga 7380agattgttct cttgaccaag agattcctga cgagttagcg atgcctgtgc aacatcttgc 7440gtctgtggta gagactaggg gtgctgctga atctgatcag gtacttactg gccctgtaga 7500atagttgatg ccttgttcat ttaatctttt ctaatgttca ttcttgcttt cttgaaaata 7560acgggtagtg atcagatgtc tttttttctc ttattaaatt cacttttctg gacagtatgg 7620tcaagatata tgtcctatgc cttcttcact ggctggaaag caacctgacc cagcagcaaa 7680cactgagagc gaaaatcttg aagaagcaat tgagcctcag tctgctggtt cagaaacagt 7740agagactact gattttgctg catcacatca ggtccctatt gaagactttc cttttttact 7800agtttaaagt tatcaatctg tgttatgttc attctaagtt tccgtgagaa aaaggtgggg 7860aaatgtggtt actgatcaag tctcgttgtt gttttaaatc gactcttttg acagggtgat 7920caagttacat 
gtcctttgct atcttcaccg actggaaatc agcctgcgcc agaagcaaat 7980attgaaggcc aaaatatcaa cacatcagct gagccccatg tagcgggtcc agatgcagta 8040gagagtggtg attatgcagt aatagatcag gttattgcct taactaaaga caaatgtctt 8100ttgttgttta aaagtcttac atctttgtaa tgctcgttct ggatatcctg caggaaacaa 8160tgggtgctca ggatgcatgc tctctgccat ctggatcggt tggaactcag tctgacctag 8220gagcaaacat tgagggtcaa aatgtcacaa cagtggctca acttcccaca gatggatcag 8280atgcagttgt aaccggtgga tctcctgtat cagatcaggt acctgcctct gctcaaggac 8340tttcttatgt gttggtttaa aggtctagtc cttagtaatg ttgaaactaa gcaaacagtg 8400gatagtgatc atatggttat ttttgcttgt gaatttaata tttctggaca gtgtgcccag 8460gatgcatctc ctatgccatt atcttcgcct ggaaatcacc ctgatacagc agttaatatc 8520gagggtttag ataacacatc agtagctgag cctcatataa gtggatcaga tgcatgtgaa 8580atggaaattt cagaacctgg tccccaagta gagcggtcaa cctttgcaag tcagtaactg 8640ccttgggcat ttttaagtat cacctaggtc gacatatgtg attgccaaac agctaacaag 8700gagatgcctt ttgtgcagat cttttccatg aaggtggcgt ggagcattca gcaggtgtaa 8760cagctcttgt tccatcactt cttaacaatg gtacggaaca gattgccgtt caacctgttc 8820ctcaaatacc tttccctgtg ttcaacgacc cgtttctgca tgaactggag aagttgcgga 8880gagaatcaga gaactcaaag aagacttttg aagaaaaagt cagtttccct cattacccag 8940ttacctcttg ttttggttta ttttctagct gcccattgac tctcagttgc ttgtgagcag 9000aaatcaatct tgaaagctga actcgagagg aagatggctg aagtacaagc agagtttcga 9060agaaaatttc atgaggtaga agccgagcat aacaccagaa cgacaaagat agagaaggat 9120aagaatcttg ttataatgaa caaactgttg gcgaatgcgt tcttgtccaa atgtactgac 9180aagaaggtat ctccctcagg agctccaagg ggtaagtgtc gaataatata gcaaattggt 9240tttaaaaata aggcgacgaa gtcataatag cactttttct ccaggtaaaa ttcagcagct 9300agcacagaga gcagcacaag tgagtgcact gagaaattac attgctcctc agcagcttca 9360ggcatcttct tttcctgctc ctgctctggt ttcggctcct ctgcaacttc agcaatcatc 9420atttcctgct cctggtccgg ctcctctgca gcctcaggca tcttcgtttc cttcttcagt 9480ctctcgtcca tcagcccttc ttctgaattt tgcggtctgt ccaatgcctc agcccagaca 9540gcctctcata tccaacatag ctccaactcc atcagttact cctgcaacaa atccaggtct 9600gcgttctcct gcaccacacc taaactcata tagaccatcc 
tcttcaactc ccgtcgccac 9660agctactcca acctcgtcag tgcctcctca agctttgaca tattcagctg tgtcaattca 9720gcagcagcaa gaacaacaac cgcaacagag cttgagcagt ggattgcaga gcaacaatga 9780agtggtttgt ctttctgacg acgagtgacc taagaggaga gatggttagg gtcttagtta 9840ttgattttta gagagttaat aatagtatat atatatatgt ataagtaggt tacctaatct 9900ctgtcgttaa tctaatttag tgagtcagga accgactcgt tggctaaggt ctctcctttt 9960gaaacgcaac gttctacttt catgtatata aatacagtct gatcacacaa cacaaattga 10020tgattgaaaa tactactgat ttaactttat agaaaaccca aattatagag cgacaacttt 10080ataaacatgt caaacttcga agttaaaatt taagacccca taattttaca attatagatt 10140ttaatactcc aactattttg tgatgttaaa agaagtatcc gagtcttttc tttccagttt 10200ccccaccgtc ccatgactcc cccagccagt agaaaaagcc aaaaaagtaa acaaaaagtc 10260gttaaaaaag ttaaattaaa aaaaaaatag atagttgacg tttactaaag tgatttgaat 10320tgaacaatt 10329<210>2<211>6571<212>DNA<213>擬南芥<220><221>CDS<222>(310)..(6312)<400>2cacaagcatg agtttttcct tccggtaatc gtaaaatcaa aaatcgctcg ctcgagaaac 60gccggtgcta gaagaggaaa gtaccgtaca taatcctgcg aacccaattc tcgtcttctt 120caaactcagt tttccgaaac cccaaacacc gcgaggattg catggcctga agaaccactt 180aatcgagaat tgtgctggaa ttctcaaatt ttccctcgcg tttttctttc acactctcgg 240aatcggaaat ttccaccaag ctccgtcaag cgatagattc tgacaattac acactttcgc 300gcagtgaat atg aag aaa gat gaa aag att ggt ttg acg ggg aga acc att 351Met Lys Lys Asp Glu Lys Ile Gly Leu Thr Gly Arg Thr Ile1 5 10tac acc aga tcc cta gca gct tca att cct gcc tca gtt gaa caa gaa 399Tyr Thr Arg Ser Leu Ala Ala Ser Ile Pro Ala Ser Val Glu Gln Glu15 20 25 30acc cct ggt ttg agg agg tca agc cgg ggg aca cca tct acg aag gta 447Thr Pro Gly Leu Arg Arg Ser Ser Arg Gly Thr Pro Ser Thr Lys Val35 40 45ata act cca gct tct gct act aga aag tca gag aga ctg gct ccc tca 495Ile Thr Pro Ala Ser Ala Thr Arg Lys Ser Glu Arg Leu Ala Pro Ser50 55 60cct gct tca gtt tca aaa aag tcc ggt gga atc gtc aag aat tcc aca 543Pro Ala Ser Val Ser Lys Lys Ser Gly Gly Ile Val Lys Asn Ser Thr65 70 75cca agt tct ttg cga agg tcc aat agg ggg aag act gaa gta tcc ttg 591Pro Ser Ser Leu 
Arg Arg Ser Asn Arg Gly Lys Thr Glu Val Ser Leu80 85 90cag agt tcc aaa gga tca gat aat tct atc agg aaa gga gat act tca 639Gln Ser Ser Lys Gly Ser Asp Asn Ser Ile Arg Lys Gly Asp Thr Ser95 100 105 110ccg gat att gag cag aga aag gat agt gtt gaa gag tcg aca gat aag 687Pro Asp Ile Glu Gln Arg Lys Asp Ser Val Glu Glu Ser Thr Asp Lys115 120 125atc aag cct ata atg tca gcc cga agt tac agg gca ttg ttt aga ggg 735Ile Lys Pro Ile Met Ser Ala Arg Ser Tyr Arg Ala Leu Phe Arg Gly130 135 140aag ctc aag gaa tct gag gca tta gtt gat gct tcc cca aat gaa gag 783Lys Leu Lys Glu Ser Glu Ala Leu Val Asp Ala Ser Pro Asn Glu Glu145 150 155gaa cta gta gtt gtt ggt tgt tct cgc cgc ata cct gca ggc aat gat 831Glu Leu Val Val Val Gly Cys Ser Arg Arg Ile Pro Ala Gly Asn Asp160 165 170gat gtt caa ggt aaa aca gat tgt cca cca cct gca gat gca gga tca 879Asp Val Gln Gly Lys Thr Asp Cys Pro Pro Pro Ala Asp Ala Gly Ser175 180 185 190aaa agg ctg cca gtt gac gaa act agt ttg gac aag ggc act gat ttt 927Lys Arg Leu Pro Val Asp Glu Thr Ser Leu Asp Lys Gly Thr Asp Phe195 200 205cct ttg aaa tca gtt acg gag acc gag aag ata gtg ctt gat gca tcc 975Pro Leu Lys Ser Val Thr Glu Thr Glu Lys Ile Val Leu Asp Ala Ser210 215 220ccc ata gtt gaa act ggg gat gac agt gtt ata ggt tca cca tct gag 1023Pro Ile Val Glu Thr Gly Asp Asp Ser Val Ile Gly Ser Pro Ser Glu225 230 235aat tta gag aca caa aag ctt caa gat ggt aag aca gat tgt tca cca 1071Asn Leu Glu Thr Gln Lys Leu Gln Asp Gly Lys Thr Asp Cys Ser Pro240 245 250cct gca aat gca gaa tcg aaa acg ctg cca gtt ggt gaa act agt tta 1119Pro Ala Asn Ala Glu Ser Lys Thr Leu Pro Val Gly Glu Thr Ser Leu255 260 265270gaa aaa gaa tat cca caa aag ttt caa gat gat aac aca gat tgt cta 1167Glu Lys Glu Tyr Pro Gln Lys Phe Gln Asp Asp Asn Thr Asp Cys Leu275 280 285cca cct gca aat gca gaa tca aaa agg ctg cca gtt ggc gaa act agt 1215Pro Pro Ala Asn Ala Glu Ser Lys Arg Leu Pro Val Gly Glu Thr Ser290 295 300tta gaa aag gac act gat ttt cct ttg aaa tca act acg gag act gga 1263Leu Glu Lys Asp Thr Asp Phe 
Pro Leu Lys Ser Thr Thr Glu Thr Gly305 310 315aag atg gtt ctt tat gca tcc ccc ata gtt gaa act agg gat gac agc 1311Lys Met Val Leu Tyr Ala Ser Pro Ile Val Glu Thr Arg Asp Asp Ser320 325 330gtt ata tgt tca cca tct aca aat tta gaa acc caa aag ctt ctt gtc 1359Val Ile Cys Ser Pro Ser Thr Asn Leu Glu Thr Gln Lys Leu Leu Val335 340 345 350agt aaa act ggc tta gaa acc gac ata gtt ttg cct ttg aaa aga aaa 1407Ser Lys Thr Gly Leu Glu Thr Asp Ile Val Leu Pro Leu Lys Arg Lys355 360365aga gac act gca gaa att gag ctg gat gca tgt gct aca gtt gca aat 1455Arg Asp Thr Ala Glu Ile Glu Leu Asp Ala Cys Ala Thr Val Ala Asn370 375 380gga gat gat cac gtt atg agt tct gat ggg gtc att cca tct cca tct 1503Gly Asp Asp His Val Met Ser Ser Asp Gly Val Ile Pro Ser Pro Ser385 390 395ggg tgc aaa aat gat aat cga cct gaa atg tgc aac acg tgt aaa aaa 1551Gly Cys Lys Asn Asp Asn Arg Pro Glu Met Cys Asn Thr Cys Lys Lys400 405 410cgg caa aag gtc aac ggt gat tgt caa aat agg agt gtt tgc tcc tgc 1599Arg Gln Lys Val Asn Gly Asp Cys Gln Asn Arg Ser Val Cys Ser Cys415 420 425 430att gtc cag cca gtt gaa gaa tct gat aac gtg aca cag gat atg aaa 1647Ile Val Gln Pro Val Glu Glu Ser Asp Asn Val Thr Gln Asp Met Lys435 440 445gaa act gga cca gtt acg agc aga gaa tat gag gag aac ggg caa ata 1695Glu Thr Gly Pro Val Thr Ser Arg Glu Tyr Glu Glu Asn Gly Gln Ile450 455 460caa cat ggt aaa tca agt gat ccc aaa ttc tat tct tcg gtg tac cca 1743Gln His Gly Lys Ser Ser Asp Pro Lys Phe Tyr Ser Ser Val Tyr Pro465 470 475gag tat tgg gtt cct gtg cag cta tca gat gta cag ctg gag caa tac 1791Glu Tyr Trp Val Pro Val Gln Leu Ser Asp Val Gln Leu Glu Gln Tyr480 485 490tgt cag act ctc ttc tcc aaa tcc tta tct ctt tct tca ctt tcg aag 1839Cys Gln Thr Leu Phe Ser Lys Ser Leu Ser Leu Ser Ser Leu Ser Lys495 500 505 510att gat ctt gga gct cta gaa gaa act ctc aat tct gta aga aaa acc 1887Ile Asp Leu Gly Ala Leu Glu Glu Thr Leu Asn Ser Val Arg Lys Thr515 520 525tgt gac cat cca tac gtt atg gat gca tct ttg aaa caa ctg ctc acc 1935Cys Asp His Pro Tyr Val Met 
Asp Ala Ser Leu Lys Gln Leu Leu Thr530 535 540aag aat ctg gag ttg cat gaa atc ctg gat gta gaa att aaa gcg agc1983Lys Asn Leu Glu Leu His Glu Ile Leu Asp Val Glu Ile Lys Ala Ser545 550 555ggg aaa ctt cac ctc ctt gat aaa atg ctt act cat ata aaa aag aat 2031Gly Lys Leu His Leu Leu Asp Lys Met Leu Thr His Ile Lys Lys Asn560 565 570ggt tta aaa gca gtt gtc ttc tac cag gca aca caa acc cct gaa ggg 2079Gly Leu Lys Ala Val Val Phe Tyr Gln Ala Thr Gln Thr Pro Glu Gly575 580 585 590ctt ctg ctt ggt aat att ctc gaa gat ttt gtg ggt caa aga ttt ggt 2127Leu Leu Leu Gly Asn Ile Leu Glu Asp Phe Val Gly Gln Arg Phe Gly595 600 605cca aaa tct tat gag cat ggg ata tat tcc tca aag aag aac tcc gct 2175Pro Lys Ser Tyr Glu His Gly Ile Tyr Ser Ser Lys Lys Asn Ser Ala610 615 620ata aac aat ttc aac aag gag agt caa tgc tgt gtt ctg ctg ttg gaa 2223Ile Asn Asn Phe Asn Lys Glu Ser Gln Cys Cys Val Leu Leu Leu Glu625 630 635aca cgt gcc tgc agt caa acc att aaa ctc ttg cga gct gat gcg ttt 2271Thr Arg Ala Cys Ser Gln Thr Ile Lys Leu Leu Arg Ala Asp Ala Phe640 645 650att ctt ttt gga agc agc ttg aat cca tcg cat gat gtt aag cac gta 2319Ile Leu Phe Gly Ser Ser Leu Asn Pro Ser His Asp Val Lys His Val655 660 665 670gag aag ata aaa atc gag tca tgt tct gaa aga act aag ata ttc cga 2367Glu Lys Ile Lys Ile Glu Ser Cys Ser Glu Arg Thr Lys Ile Phe Arg675 680 685ttg tac tca gta tgt aca gtt gaa gaa aaa gcc ctg att ctg gct agg 2415Leu Tyr Ser Val Cys Thr Val Glu Glu Lys Ala Leu Ile Leu Ala Arg690 695 700caa aat atg cgg caa aat aaa gct gta gag aac cta aac cgc tct ctc 2463Gln Asn Met Arg Gln Asn Lys Ala Val Glu Asn Leu Asn Arg Ser Leu705 710 715acg cac gca ctg ctc atg tgg ggg gcg tca tac tta ttt gat aaa ctg 2511Thr His Ala Leu Leu Met Trp Gly Ala Ser Tyr Leu Phe Asp Lys Leu720 725 730gat cat ttt cac agc agt gaa act cca gat tca gga gtt tca ttt gaa 2559Asp His Phe His Ser Ser Glu Thr Pro Asp Ser Gly Val Ser Phe Glu735 740 745 750caa tct att atg gac ggc gtg att cat gaa ttc tcg tcc ata ctt tct 2607Gln Ser Ile Met Asp Gly Val 
Ile His Glu Phe Ser Ser Ile Leu Ser755 760 765tcc aaa ggt gga gaa gaa aat gaa gtc aag ctg tgt cta ctt ttg gag 2655Ser Lys Gly Gly Glu Glu Asn Glu Val Lys Leu Cys Leu Leu Leu Glu770 775 780gcc aag cat gct cag gga act tac agc agt gat tct act cta ttt ggt 2703Ala Lys His Ala Gln Gly Thr Tyr Ser Ser Asp Ser Thr Leu Phe Gly785 790 795gaa gac cat att aag ttg tca gat gaa gag agt cca aat ata ttt tgg 2751Glu Asp His Ile Lys Leu Ser Asp Glu Glu Ser Pro Asn Ile Phe Trp800 805 810tca aag ctg ttg ggg gga aaa aat cct atg tgg aaa tac cct tca gat 2799Ser Lys Leu Leu Gly Gly Lys Asn Pro Met Trp Lys Tyr Pro Ser Asp815 820 825 830act ccc caa agg aat cga aaa cga gtt cag tat ttt gag ggt tct gaa 2847Thr Pro Gln Arg Asn Arg Lys Arg Val Gln Tyr Phe Glu Gly Ser Glu835 840 845gcg agt ccc aaa act ggc gat ggt gga aat gca aag aag cga aag aag 2895Ala Ser Pro Lys Thr Gly Asp Gly Gly Asn Ala Lys Lys Arg Lys Lys
850 855 860gct tct gat gat gtc act gat ccc cgg gtc act gat ccg cca gta gat 2943Ala Ser Asp Asp Val Thr Asp Pro Arg Val Thr Asp Pro Pro Val Asp865 870 875gat gat gaa aga aag gcc tct ggg aag gat cac atg ggg gct ttg gag 2991Asp Asp Glu Arg Lys Ala Ser Gly Lys Asp His Met Gly Ala Leu Glu880 885 890tca cca aaa gtc ata aca ctc cag tca tca tgt aaa tct tct ggt aca 3039Ser Pro Lys Val Ile Thr Leu Gln Ser Ser Cys Lys Ser Ser Gly Thr895 900 905 910gat ggt aca ttg gat gga aat gat gct ttt ggc ttg tat tct atg ggc 3087Asp Gly Thr Leu Asp Gly Asn Asp Ala Phe Gly Leu Tyr Ser Met Gly915 920 925agc cat atc tct gga atc cca gag gat atg tta gct agt caa gat tgg 3135Ser His Ile Ser Gly Ile Pro Glu Asp Met Leu Ala Ser Gln Asp Trp930 935 940ggg aaa ata ccg gat gaa tca cag agg agg ctc cac act gtt tta aag 3183Gly Lys Ile Pro Asp Glu Ser Gln Arg Arg Leu His Thr Val Leu Lys945 950 955ccg aag atg gca aaa ctt tgc caa gtt ttg cat ctt tca gat gct tgc 3231Pro Lys Met Ala Lys Leu Cys Gln Val Leu His Leu Ser Asp Ala Cys960 965 970aca agc atg gtc gga aat ttt ctc gaa tat gtt att gaa aat cac cga 3279Thr Ser Met Val Gly Asn Phe Leu Glu Tyr Val Ile Glu Asn His Arg975 980 985 990atc tac gaa gag cca gcc act act ttt cag gca ttc cag ata gcc ctg 3327Ile Tyr Glu Glu Pro Ala Thr Thr Phe Gln Ala Phe Gln Ile Ala Leu99510001005agt tgg att gca gcc ttg ttg gta aag caa att ctt agc cac aaa gaa 3375Ser Trp Ile Ala Ala Leu Leu Val Lys Gln Ile Leu Ser His Lys Glu101010151020tct ctg gtc cgt gca aat tct gaa tta gct ttc aaa tgc tct aga gta 3423Ser Leu Val Arg Ala Asn Ser Glu Leu Ala Phe Lys Cys Ser Arg Val102510301035gag gtg gat tat att tat tcg ata ttg tcc tgc atg aag agt ctg ttc 3471Glu Val Asp Tyr Ile Tyr Ser Ile Leu Ser Cys Met Lys Ser Leu Phe104010451050ctg gag cat aca caa ggt ttg cag ttc gat tgc ttt ggt act aat tct 3519Leu Glu His Thr Gln Gly Leu Gln Phe Asp Cys Phe Gly Thr Asn Ser1055 106010651070aaa cag tca gtg gtt agc aca aaa cta gta aat gaa agt ctc tca ggg 3567Lys Gln Ser Val Val Ser Thr Lys Leu Val Asn Glu Ser Leu 
Ser Gly107510801085gct aca gtg cgt gac gaa aag att aat acg aag tcg atg cga aat agc 3615Ala Thr Val Arg Asp Glu Lys Ile Asn Thr Lys Ser Met Arg Asn Ser109010951100tca gag gat gaa gag tgc atg act gag aag aga tgt agc cat tat agc 3663Ser Glu Asp Glu Glu Cys Met Thr Glu Lys Arg Cys Ser His Tyr Ser110511101115aca gca aca aga gat atc gaa aag act att agt ggc ata aaa aag aaa 3711Thr Ala Thr Arg Asp Ile Glu Lys Thr Ile Ser Gly Ile Lys Lys Lys112011251130tac aag aag caa gtg caa aag ctt gta caa gag cat gag gaa aag aaa 3759Tyr Lys Lys Gln Val Gln Lys Leu Val Gln Glu His Glu Glu Lys Lys1135 114011451150atg gag ctg tta aat atg tat gca gac aag aag cag aaa ctt gaa act 3807Met Glu Leu Leu Asn Met Tyr Ala Asp Lys Lys Gln Lys Leu Glu Thr115511601165agt aaa agt gtg gaa gca gca gta att cgt att acc tgt tca cgg acc 3855Ser Lys Ser Val Glu Ala Ala Val Ile Arg Ile Thr Cys Ser Arg Thr117011751180agt act caa gtg ggt gat ctc aaa ctg ctg gat cat aat tat gaa aga 3903Ser Thr Gln Val Gly Asp Leu Lys Leu Leu Asp His Asn Tyr Glu Arg118511901195aag ttt gat gaa atc aaa agt gag aaa aat gaa tgc ctc aaa agt ctg 3951Lys Phe Asp Glu Ile Lys Ser Glu Lys Asn Glu Cys Leu Lys Ser Leu120012051210gag caa atg cac gag gtt gca aag aag aag ttg gct gag gat gaa gcc 3999Glu Gln Met His Glu Val Ala Lys Lys Lys Leu Ala Glu Asp Glu Ala1215 122012251230tgt tgg att aat cgg ata aag agc tgg gca gct aaa tta aaa gtt tgt 4047Cys Trp Ile Asn Arg Ile Lys Ser Trp Ala Ala Lys Leu Lys Val Cys123512401245gtt ccc att caa agt ggc aat aac aag cat ttt agt ggt tca tca aac 4095Val Pro Ile Gln Ser Gly Asn Asn Lys His Phe Ser Gly Ser Ser Asn125012551260att tcc caa aat gct cct gat gta caa att tgc aat aat gct aac gtt 4143Ile Ser Gln Asn Ala Pro Asp Val Gln Ile Cys Asn Asn Ala Asn Val126512701275gaa gct act tac gct gat acg aat tgc atg gct tcc aag gtt aat caa 4191Glu Ala Thr Tyr Ala Asp Thr Asn Cys Met Ala Ser Lys Val Asn Gln128012851290gtg cca gaa gca gaa aac aca tta gga acc atg tcg ggt ggc agc act 4239Val Pro Glu Ala Glu Asn Thr Leu Gly Thr Met 
Ser Gly Gly Ser Thr1295 130013051310caa caa gtt cat gaa atg gtg gat gta aga aat gac gag aca atg gat 4287Gln Gln Val His Glu Met Val Asp Val Arg Asn Asp Glu Thr Met Asp131513201325gtc tca gct ttg tct cgt gaa cag ctt aca aag agc cag tcc aat gag 4335Val Ser Ala Leu Ser Arg Glu Gln Leu Thr Lys Ser Gln Ser Asn Glu133013351340cac gct tct atc act gtg cct gag att ttg att cct gct gac tgt caa 4383His Ala Ser Ile Thr Val Pro Glu Ile Leu Ile Pro Ala Asp Cys Gln134513501355gag gaa ttt gcg gcc ttg aac gtg cat ttg tca gaa gac cag aat tgt 4431Glu Glu Phe Ala Ala Leu Asn Val His Leu Ser Glu Asp Gln Asn Cys136013651370gac aga ata aca tct gcg gca tca gat gaa gat gtt tca tca agg gtg 4479Asp Arg Ile Thr Ser Ala Ala Ser Asp Glu Asp Val Ser Ser Arg Val1375 138013851390cca gag gta tcc cag tca ctc gaa aat ctt tct gcc tcc ccc gag ttt 4527Pro Glu Val Ser Gln Ser Leu Glu Asn Leu Ser Ala Ser Pro Glu Phe139514001405tct cta aat aga gag gag gct ttg gtt aca aca gaa aat aga aga aca 4575Ser Leu Asn Arg Glu Glu Ala Leu Val Thr Thr Glu Asn Arg Arg Thr141014151420agt cat gtg ggt ttt gat act gat aac att ttg gac cag cag aat aga 4623Ser His Val Gly Phe Asp Thr Asp Asn Ile Leu Asp Gln Gln Asn Arg142514301435gaa gat tgt tct ctt gac caa gag att cct gac gag tta gcg atg cct 4671Glu Asp Cys Ser Leu Asp Gln Glu Ile Pro Asp Glu Leu Ala Met Pro144014451450gtg caa cat ctt gcg tct gtg gta gag act agg ggt gct gct gaa tct 4719Val Gln His Leu Ala Ser Val Val Glu Thr Arg Gly Ala Ala Glu Ser1455 146014651470gat cag tat ggt caa gat ata tgt cct atg cct tct tca ctg gct gga 4767Asp Gln Tyr Gly Gln Asp Ile Cys Pro Met Pro Ser Ser Leu Ala Gly147514801485aag caa cct gac cca gca gca aac act gag agc gaa aat ctt gaa gaa 4815Lys Gln Pro Asp Pro Ala Ala Asn Thr Glu Ser Glu Asn Leu Glu Glu149014951500gca att gag cct cag tct gct ggt tca gaa aca gta gag act act gat 4863Ala Ile Glu Pro Gln Ser Ala Gly Ser Glu Thr Val Glu Thr Thr Asp150515101515ttt gct gca tca cat cag ggt gat caa gtt aca tgt cct ttg cta tct 4911Phe Ala Ala Ser His Gln 
Gly Asp Gln Val Thr Cys Pro Leu Leu Ser152015251530tca ccg act gga aat cag cct gcg cca gaa gca aat att gaa ggc caa 4959Ser Pro Thr Gly Asn Gln Pro Ala Pro Glu Ala Asn Ile Glu Gly Gln1535 154015451550aat atc aac aca tca gct gag ccc cat gta gcg ggt cca gat gca gta 5007Asn Ile Asn Thr Ser Ala Glu Pro His Val Ala Gly Pro Asp Ala Val155515601565gag agt ggt gat tat gca gta ata gat cag gaa aca atg ggt gct cag 5055Glu Ser Gly Asp Tyr Ala Val Ile Asp Gln Glu Thr Met Gly Ala Gln157015751580gat gca tgc tct ctg cca tct gga tcg gtt gga act cag tct gac cta 5103Asp Ala Cys Ser Leu Pro Ser Gly Ser Val Gly Thr Gln Ser Asp Leu158515901595gga gca aac att gag ggt caa aat gtc aca aca gtg gct caa ctt ccc 5151Gly Ala Asn Ile Glu Gly Gln Asn Val Thr Thr Val Ala Gln Leu Pro160016051610aca gat gga tca gat gca gtt gta acc ggt gga tct cct gta tca gat 5199Thr Asp Gly Ser Asp Ala Val Val Thr Gly Gly Ser Pro Val Ser Asp1615 162016251630cag tgt gcc cag gat gca tct cct atg cca tta tct tcg cct gga aat 5247Gln Cys Ala Gln Asp Ala Ser Pro Met Pro Leu Ser Ser Pro Gly Asn163516401645cac cct gat aca gca gtt aat atc gag ggt tta gat aac aca tca gta 5295His Pro Asp Thr Ala Val Asn Ile Glu Gly Leu Asp Asn Thr Set Val165016551660gct gag cct cat ata agt gga tca gat gca tgt gaa atg gaa att tca 5343Ala Glu Pro His Ile Ser Gly Ser Asp Ala Cys Glu Met Glu Ile Ser166516701675gaa cct ggt ccc caa gta gag cgg tca acc ttt gca aat ctt ttc cat 5391Glu Pro Gly Pro Gln Val Glu Arg Ser Thr Phe Ala Asn Leu Phe His168016851690gaa ggt ggc gtg gag cat tca gca ggt gta aca gct ctt gtt cca tca 5439Glu Gly Gly Val Glu His Ser Ala Gly Val Thr Ala Leu Val Pro Ser1695 170017051710ctt ctt aac aat ggt acg gaa cag att gcc gtt caa cct gtt cct caa 5487Leu Leu Asn Asn Gly Thr Glu Gln Ile Ala Val Gln Pro Val Pro Gln171517201725ata cct ttc cct gtg ttc aac gac ccg ttt ctg cat gaa ctg gag aag 5535Ile Pro Phe Pro Val Phe Asn Asp Pro Phe Leu His Glu Leu Glu Lys
173017351740ttg cgg aga gaa tca gag aac tca aag aag act ttt gaa gaa aaa aaa 5583Leu Arg Arg Glu Ser Glu Asn Ser Lys Lys Thr Phe Glu Glu Lys Lys174517501755tca atc ttg aaa gct gaa ctc gag agg aag atg gct gaa gta caa gca 5631Ser Ile Leu Lys Ala Glu Leu Glu Arg Lys Met Ala Glu Val Gln Ala176017651770gag ttt cga aga aaa ttt cat gag gta gaa gcc gag cat aac acc aga 5679Glu Phe Arg Arg Lys Phe His Glu Val Glu Ala Glu His Asn Thr Arg1775 178017851790acg aca aag ata gag aag gat aag aat ctt gtt ata atg aac aaa ctg 5727Thr Thr Lys Ile Glu Lys Asp Lys Asn Leu Val Ile Met Asn Lys Leu179518001805ttg gcg aat gcg ttc ttg tcc aaa tgt act gac aag aag gta tct ccc 5775Leu Ala Asn Ala Phe Leu Ser Lys Cys Thr Asp Lys Lys Val Ser Pro1810 1815 1820tca gga gct cca agg ggt aaa att cag cag cta gca cag aga gca gca 5823Ser Gly Ala Pro Arg Gly Lys Ile Gln Gln Leu Ala Gln Arg Ala Ala182518301835caa gtg agt gca ctg aga aat tac att gct cct cag cag ctt cag gca 5871Gln Val Ser Ala Leu Arg Asn Tyr Ile Ala Pro Gln Gln Leu Gln Ala184018451850tct tct ttt cct gct cct gct ctg gtt tcg gct cct ctg caa ctt cag 5919Ser Ser Phe Pro Ala Pro Ala Leu Val Ser Ala Pro Leu Gln Leu Gln1855 186018651870caa tca tca ttt cct gct cct ggt ccg gct cct ctg cag cct cag gca 5967Gln Ser Ser Phe Pro Ala Pro Gly Pro Ala Pro Leu Gln Pro Gln Ala1875 18801885tct tcg ttt cct tct tca gtc tct cgt cca tca gcc ctt ctt ctg aat 6015Ser Ser Phe Pro Ser Ser Val Ser Arg Pro Ser Ala Leu Leu Leu Asn189018951900ttt gcg gtc tgt cca atg cct cag ccc aga cag cct ctc ata tcc aac 6063Phe Ala Val Cys Pro Met Pro Gln Pro Arg Gln Pro Leu Ile Ser Asn190519101915ata gct cca act cca tca gtt act cct gca aca aat cca ggt ctg cgt 6111Ile Ala Pro Thr Pro Ser Val Thr Pro Ala Thr Asn Pro Gly Leu Arg192019251930tct cct gca cca cac cta aac tca tat aga cca tcc tct tca act ccc 6159Ser Pro Ala Pro His Leu Asn Ser Tyr Arg Pro Ser Ser Ser Thr Pro1935 194019451950gtc gcc aca gct act cca acc tcg tca gtg cct cct caa gct ttg aca 6207Val Ala Thr Ala Thr Pro Thr Ser Ser Val 
Pro Pro Gln Ala Leu Thr195519601965tat tca gct gtg tca att cag cag cag caa gaa caa caa ccg caa cag 6255Tyr Ser Ala Val Ser Ile Gln Gln Gln Gln Glu Gln Gln Pro Gln Gln197019751980agc ttg agc agt gga ttg cag agc aac aat gaa gtg gtt tgt ctt tct 6303Ser Leu Ser Ser Gly Leu Gln Ser Asn Asn Glu Val Val Cys Leu Ser198519901995gac gac gag tgacctaaga ggagagatgg ttagggtctt agttattgat 6352Asp Asp Glu2000ttttagagag ttaataatag tatatatata tatgtataag taggttacct aatctctgtc 6412gttaatctaa tttagtgagt caggaaccga ctcgttggct aaggtctctc cttttgaaac 6472gcaacgttct actttcatgt atataaatac agtctgatca cacaacacaa attgatgatt 6532gaaaatacta ctgatttaac ttaaaaaaaa aaaaaaaaa6571<210>3<211>2001<212>PRT<213>擬南芥<400>3Met Lys Lys Asp Glu Lys Ile Gly Leu Thr Gly Arg Thr Ile Tyr Thr1 5 10 15Arg Ser Leu Ala Ala Ser Ile Pro Ala Ser Val Glu Gln Glu Thr Pro20 25 30Gly Leu Arg Arg Ser Ser Arg Gly Thr Pro Ser Thr Lys Val Ile Thr35 40 45Pro Ala Ser Ala Thr Arg Lys Ser Glu Arg Leu Ala Pro Ser Pro Ala50 55 60Ser Val Ser Lys Lys Ser Gly Gly Ile Val Lys Asn Ser Thr Pro Ser65 70 75 80Ser Leu Arg Arg Ser Asn Arg Gly Lys Thr Glu Val Ser Leu Gln Ser85 90 95Ser Lys Gly Ser Asp Asn Ser Ile Arg Lys Gly Asp Thr Ser Pro Asp100 105 110Ile Glu Gln Arg Lys Asp Ser Val Glu Glu Ser Thr Asp Lys Ile Lys115 120 125Pro Ile Met Ser Ala Arg Ser Tyr Arg Ala Leu Phe Arg Gly Lys Leu
130 135 140Lys Glu Ser Glu Ala Leu Val Asp Ala Ser Pro Asn Glu Glu Glu Leu145 150 155 160Val Val Val Gly Cys Ser Arg Arg Ile Pro Ala Gly Asn Asp Asp Val165 170 175Gln Gly Lys Thr Asp Cys Pro Pro Pro Ala Asp Ala Gly Ser Lys Arg180 185 190Leu Pro Val Asp Glu Thr Ser Leu Asp Lys Gly Thr Asp Phe Pro Leu195 200 205Lys Ser Val Thr Glu Thr Glu Lys Ile Val Leu Asp Ala Ser Pro Ile210 215 220Val Glu Thr Gly Asp Asp Ser Val Ile Gly Ser Pro Ser Glu Asn Leu225 230 235 240Glu Thr Gln Lys Leu Gln Asp Gly Lys Thr Asp Cys Ser Pro Pro Ala245 250 255Asn Ala Glu Ser Lys Thr Leu Pro Val Gly Glu Thr Ser Leu Glu Lys260 265 270Glu Tyr Pro Gln Lys Phe Gln Asp Asp Asn Thr Asp Cys Leu Pro Pro275 280 285Ala Asn Ala Glu Ser Lys Arg Leu Pro Val Gly Glu Thr Ser Leu Glu290 295 300Lys Asp Thr Asp Phe Pro Leu Lys Ser Thr Thr Glu Thr Gly Lys Met305 310 315 320Val Leu Tyr Ala Ser Pro Ile Val Glu Thr Arg Asp Asp Ser Val Ile325 330 335Cys Ser Pro Ser Thr Asn Leu Glu Thr Gln Lys Leu Leu Val Ser Lys340 345 350Thr Gly Leu Glu Thr Asp Ile Val Leu Pro Leu Lys Arg Lys Arg Asp355 360 365Thr Ala Glu Ile Glu Leu Asp Ala Cys Ala Thr Val Ala Asn Gly Asp370 375 380Asp His Val Met Ser Ser Asp Gly Val Ile Pro Ser Pro Ser Gly Cys385 390 395 400Lys Asn Asp Asn Arg Pro Glu Met Cys Asn Thr Cys Lys Lys Arg Gln405 410 415Lys Val Asn Gly Asp Cys Gln Asn Arg Ser Val Cys Ser Cys Ile Val420 425 430Gln Pro Val Glu Glu Ser Asp Asn Val Thr Gln Asp Met Lys Glu Thr435 440 445Gly Pro Val Thr Ser Arg Glu Tyr Glu Glu Asn Gly Gln Ile Gln His450 455 460Gly Lys Ser Ser Asp Pro Lys Phe Tyr Ser Ser Val Tyr Pro Glu Tyr465 470 475 480Trp Val Pro Val Gln Leu Ser Asp Val Gln Leu Glu Gln Tyr Cys Gln485 490 495Thr Leu Phe Ser Lys Ser Leu Ser Leu Ser Ser Leu Ser Lys Ile Asp500 505 510Leu Gly Ala Leu Glu Glu Thr Leu Asn Ser Val Arg Lys Thr Cys Asp515 520 525His Pro Tyr Val Met Asp Ala Ser Leu Lys Gln Leu Leu Thr Lys Asn530 535 540Leu Glu Leu His Glu Ile Leu Asp Val Glu Ile Lys Ala Ser Gly Lys545 550 555 560Leu His Leu Leu Asp Lys Met Leu Thr His 
Ile Lys Lys Asn Gly Leu565 570 575Lys Ala Val Val Phe Tyr Gln Ala Thr Gln Thr Pro Glu Gly Leu Leu580 585 590Leu Gly Asn Ile Leu Glu Asp Phe Val Gly Gln Arg Phe Gly Pro Lys595 600 605Ser Tyr Glu His Gly Ile Tyr Ser Ser Lys Lys Asn Ser Ala Ile Asn610 615 620Asn Phe Asn Lys Glu Ser Gln Cys Cys Val Leu Leu Leu Glu Thr Arg625 630 635 640Ala Cys Ser Gln Thr Ile Lys Leu Leu Arg Ala Asp Ala Phe Ile Leu645 650 655Phe Gly Ser Ser Leu Asn Pro Ser His Asp Val Lys His Val Glu Lys660 665 670Ile Lys Ile Glu Ser Cys Ser Glu Arg Thr Lys Ile Phe Arg Leu Tyr675 680 685Ser Val Cys Thr Val Glu Glu Lys Ala Leu Ile Leu Ala Arg Gln Asn690 695 700Met Arg Gln Asn Lys Ala Val Glu Asn Leu Asn Arg Ser Leu Thr His705 710 715 720Ala Leu Leu Met Trp Gly Ala Ser Tyr Leu Phe Asp Lys Leu Asp His725 730 735Phe His Ser Ser Glu Thr Pro Asp Ser Gly Val Ser Phe Glu Gln Ser740 745 750Ile Met Asp Gly Val Ile His Glu Phe Ser Ser Ile Leu Ser Ser Lys755 760 765Gly Gly Glu Glu Asn Glu Val Lys Leu Cys Leu Leu Leu Glu Ala Lys770 775 780His Ala Gln Gly Thr Tyr Ser Ser Asp Ser Thr Leu Phe Gly Glu Asp785 790 795 800His Ile Lys Leu Ser Asp Glu Glu Ser Pro Asn Ile Phe Trp Ser Lys805 810 815Leu Leu Gly Gly Lys Asn Pro Met Trp Lys Tyr Pro Ser Asp Thr Pro820 825 830Gln Arg Asn Arg Lys Arg Val Gln Tyr Phe Glu Gly Ser Glu Ala Ser835 840 845Pro Lys Thr Gly Asp Gly Gly Asn Ala Lys Lys Arg Lys Lys Ala Ser850 855 860Asp Asp Val Thr Asp Pro Arg Val Thr Asp Pro Pro Val Asp Asp Asp865 870 875 880Glu Arg Lys Ala Ser Gly Lys Asp His Met Gly Ala Leu Glu Ser Pro885 890 895Lys Val Ile Thr Leu Gln Ser Ser Cys Lys Ser Ser Gly Thr Asp Gly900 905 910Thr Leu Asp Gly Asn Asp Ala Phe Gly Leu Tyr Ser Met Gly Ser His915 920 925Ile Ser Gly Ile Pro Glu Asp Met Leu Ala Ser Gln Asp Trp Gly Lys930 935 940Ile Pro Asp Glu Ser Gln Arg Arg Leu His Thr Val Leu Lys Pro Lys945 950 955 960Met Ala Lys Leu Cys Gln Val Leu His Leu Ser Asp Ala Cys Thr Ser965 970 975Met Val Gly Asn Phe Leu Glu Tyr Val Ile Glu Asn His Arg Ile Tyr980 985 990Glu Glu Pro Ala Thr 
Thr Phe Gln Ala Phe Gln Ile Ala Leu Ser Trp99510001005Ile Ala Ala Leu Leu Val Lys Gln Ile Leu Ser His Lys Glu Ser Leu101010151020Val Arg Ala Asn Ser Glu Leu Ala Phe Lys Cys Ser Arg Val Glu Val025103010351040Asp Tyr Ile Tyr Ser Ile Leu Ser Cys Met Lys Ser Leu Phe Leu Glu104510501055His Thr Gln Gly Leu Gln Phe Asp Cys Phe Gly Thr Asn Ser Lys Gln106010651070Ser Val Val Ser Thr Lys Leu Val Asn Glu Ser Leu Ser Gly Ala Thr107510801085Val Arg Asp Glu Lys Ile Asn Thr Lys Ser Met Arg Asn Ser Ser Glu109010951100Asp Glu Glu Cys Met Thr Glu Lys Arg Cys Ser His Tyr Ser Thr Ala105111011151120Thr Arg Asp Ile Glu Lys Thr Ile Ser Gly Ile Lys Lys Lys Tyr Lys112511301135Lys Gln Val Gln Lys Leu Val Gln Glu His Glu Glu Lys Lys Met Glu114011451150Leu Leu Asn Met Tyr Ala Asp Lys Lys Gln Lys Leu Glu Thr Ser Lys115511601165Ser Val Glu Ala Ala Val Ile Arg Ile Thr Cys Ser Arg Thr Ser Thr117011751180Gln Val Gly Asp Leu Lys Leu Leu Asp His Asn Tyr Glu Arg Lys Phe185119011951200Asp Glu Ile Lys Ser Glu Lys Asn Glu Cys Leu Lys Ser Leu Glu Gln120512101215Met His Glu Val Ala Lys Lys Lys Leu Ala Glu Asp Glu Ala Cys Trp122012251230Ile Asn Arg Ile Lys Ser Trp Ala Ala Lys Leu Lys Val Cys Val Pro123512401245Ile Gln Ser Gly Asn Asn Lys His Phe Ser Gly Ser Ser Asn Ile Ser125012551260Gln Asn Ala Pro Asp Val Gln Ile Cys Asn Asn Ala Asn Val Glu Ala265127012751280Thr Tyr Ala Asp Thr Asn Cys Met Ala Ser Lys Val Asn Gln Val Pro128512901295Glu Ala Glu Asn Thr Leu Gly Thr Met Ser Gly Gly Ser Thr Gln Gln130013051310Val His Glu Met Val Asp Val Arg Asn Asp Glu Thr Met Asp Val Ser131513201325Ala Leu Ser Arg Glu Gln Leu Thr Lys Ser Gln Ser Asn Glu His Ala133013351340Ser Ile Thr Val Pro Glu Ile Leu Ile Pro Ala Asp Cys Gln Glu Glu345135013551360Phe Ala Ala Leu Asn Val His Leu Ser Glu Asp Gln Asn Cys Asp Arg136513701375Ile Thr Ser Ala Ala Ser Asp Glu Asp Val Ser Ser Arg Val Pro Glu138013851390Val Ser Gln Ser Leu Glu Asn Leu Ser Ala Ser Pro Glu Phe Ser Leu139514001405Asn Arg Glu Glu Ala Leu Val Thr Thr Glu Asn Arg Arg Thr 
Ser His141014151420Val Gly Phe Asp Thr Asp Asn Ile Leu Asp Gln Gln Asn Arg Glu Asp425143014351440Cys Ser Leu Asp Gln Glu Ile Pro Asp Glu Leu Ala Met Pro Val Gln144514501455His Leu Ala Ser Val Val Glu Thr Arg Gly Ala Ala Glu Ser Asp Gln146014651470Tyr Gly Gln Asp Ile Cys Pro Met Pro Ser Ser Leu Ala Gly Lys Gln147514801485Pro Asp Pro Ala Ala Asn Thr Glu Ser Glu Asn Leu Glu Glu Ala Ile149014951500Glu Pro Gln Ser Ala Gly Ser Glu Thr Val Glu Thr Thr Asp Phe Ala505151015151520Ala Ser His Gln Gly Asp Gln Val Thr Cys Pro Leu Leu Ser Ser Pro152515301535Thr Gly Asn Gln Pro Ala Pro Glu Ala Asn Ile Glu Gly Gln Asn Ile154015451550Asn Thr Ser Ala Glu Pro His Val Ala Gly Pro Asp Ala Val Glu Ser155515601565Gly Asp Tyr Ala Val Ile Asp Gln Glu Thr Met Gly Ala Gln Asp Ala157015751580Cys Ser Leu Pro Ser Gly Ser Val Gly Thr Gln Ser Asp Leu Gly Ala585159015951600Asn Ile Glu Gly Gln Asn Val Thr Thr Val Ala Gln Leu Pro Thr Asp160516101615Gly Ser Asp Ala Val Val Thr Gly Gly Ser Pro Val Ser Asp Gln Cys162016251630Ala Gln Asp Ala Ser Pro Met Pro Leu Ser Ser Pro Gly Asn His Pro163516401645Asp Thr Ala Val Asn Ile Glu Gly Leu Asp Asn Thr Ser Val Ala Glu165016551660Pro His Ile Ser Gly Ser Asp Ala Cys Glu Met Glu Ile Ser Glu Pro665167016751680Gly Pro Gln Val Glu Arg Ser Thr Phe Ala Asn Leu Phe His Glu Gly168516901695Gly Val Glu His Ser Ala Gly Val Thr Ala Leu Val Pro Ser Leu Leu1700 17051710Asn Asn Gly Thr Glu Gln Ile Ala Val Gln Pro Val Pro Gln Ile Pro171517201725Phe Pro Val Phe Asn Asp Pro Phe Leu His Glu Leu Glu Lys Leu Arg173017351740Arg Glu Ser Glu Asn Ser Lys Lys Thr Phe Glu Glu Lys Lys Ser Ile745175017551760Leu Lys Ala Glu Leu Glu Arg Lys Met Ala Glu Val Gln Ala Glu Phe176517701775Arg Arg Lys Phe His Glu Val Glu Ala Glu His Asn Thr Arg Thr Thr178017851790Lys Ile Glu Lys Asp Lys Asn Leu Val Ile Met Asn Lys Leu Leu Ala179518001805Asn Ala Phe Leu Ser Lys Cys Thr Asp Lys Lys Val Ser Pro Ser Gly181018151820Ala Pro Arg Gly Lys Ile Gln Gln Leu Ala Gln Arg Ala Ala Gln Val825183018351840Ser Ala Leu 
Arg Asn Tyr Ile Ala Pro Gln Gln Leu Gln Ala Ser Ser184518501855Phe Pro Ala Pro Ala Leu Val Ser Ala Pro Leu Gln Leu Gln Gln Ser186018651870Ser Phe Pro Ala Pro Gly Pro Ala Pro Leu Gln Pro Gln Ala Ser Ser187518801885Phe Pro Ser Ser Val Ser Arg Pro Ser Ala Leu Leu Leu Asn Phe Ala189018951900Val Cys Pro Met Pro Gln Pro Arg Gln Pro Leu Ile Ser Asn Ile Ala905191019151920Pro Thr Pro Ser Val Thr Pro Ala Thr Asn Pro Gly Leu Arg Ser Pro192519301935Ala Pro His Leu Asn Ser Tyr Arg Pro Ser Ser Ser Thr Pro Val Ala194019451950Thr Ala Thr Pro Thr Ser Ser Val Pro Pro Gln Ala Leu Thr Tyr Ser195519601965Ala Val Ser Ile Gln Gln Gln Gln Glu Gln Gln Pro Gln Gln Ser Leu197019751980Ser Ser Gly Leu Gln Ser Asn Asn Glu Val Val Cys Leu Ser Asp Asp9851990 1995 2000Glu<210>4<211>21<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>4catctacggc aatgtaccag c 21<210>5<211>21<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>5gatgggaatt ggctgagtgg c 21<210>6<211>21<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>6cagttccaaa cgtaaaacgg c 21<210>7<211>15<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>7ntcgastwts gwgtt 15<210>8<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>8ngtcgaswga nawgaa 16<210>9<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>9wgtgnagwan canaga 16<210>10<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>10wggwancwga wangca 16<210>11<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>11wcgwwgawca ngncga 16<210>12<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>12wgcnagtnag wanaag 16<210>13<211>16<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>13awgcangncw ganata 16<210>14<211>24<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>14ctgtacatac tgagtacaat cgga 24<210>15<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>15gcttcaattc ctgcctcagt tgaac25<210>16<211>24<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>16ctctacgtgc ttaacatcat gcga 24<210>17<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>17ccagcttctg ctactagaaa gtcag25<210>18<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>18ctggagttgc atgaaatcct 
ggatg25<210>19<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>19gctctttgta agctgttcac gagac 25<210>20<211>24<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>20tcgcatgatg ttaagcacgt agag24<210>21<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>21gagtactggt ccgtgaacag gtaat 25<210>22<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>22atgcttgcac aagcatggtc ggaaa 25<210>23<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>23tgcaacatcg tgcatttgct ccaga 25<210>24<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>24cacaagcatg agtttttcct tccgg 25<210>25<211>25<212>DNA<213>人工序列<220><223>人工序列描述合成寡核苷酸<400>25ctgactttct agtagcagaa gctgg 25<210>26<211>519<212>DNA<213>甘藍(lán)<220><223>seg1-23<400>26gaattcctgn nttacggcat ccttaataga ctgttcaaat ggaactcctg aacctggggt 60tccactccca tggaagtgtt ccagcttatc aaataaatat gatgcccccc acatgagcaa 120tgcatgtgtg agaggacggt ttaggttctc tagaggctta ttttgcctag caagaatcag 180ggttttttct tcaactgtaa acactgagta caaccggaaa atcttagttc tttcagaaca 240cgactcaacc tttatcttct ctaagagctt aacgtcatgc gatggattca ggctgcttcc 300aaaaagtata aaagactcag cgcgtaagag tttaatgctt tgactacagg cacgtatttc 360cagcagcaga ataaaacact cactctcctt gttgaaattg tttatagcgt tcttcttcga 420gaggcagacc ccatgctcat aggaattttg accaaatctt tgcatcagaa aatcttcgag 480aatattacca agcagaagcc cctcagggct atgtattgc519<210>27<211>419<212>DNA<213>甘藍(lán)<220><223>seq1-27<400>27gaattcagga tcaaaagggt tgccggttgg agaaactggt ttagagaaag gctctgattt 60tcctgtggaa gtaactaagg atatagagaa gacagtggtt gattcatccc ccatggttga 120aactgaggat ggcagtgtta taggttcacc atccgagaat ccagaaccac aaaagcttcg 180tgacagtgaa actagcttgg aaaccgatat agacttggct ctgaaaagaa aaagagacac 240tgcagaaatt gtgatggatg catgtacaaa tgcagatgac cgcattatga gtactgatgg 300ggttattcct tttccacccg tgtgcacaaa tattaatcaa cccgaaaggt gtggcacatg 360tcaaaaacgg caaaagtaag aatttccgac tgttgtctgt cgttttgaaa ccatttgcc 419<210>28<211>467<212>DNA<213>甘藍(lán)<220><223>seq1-43<400>28gaattctcgt ccatactttc ttccgatgtt ggagaagaaa atgaaggcaa gctgtgtcta 60cttttggaag ccaagcatgc tcagggaagt 
tacagcactg atgctactct atttggtgaa 120gaacatgtca agttatcaga tgaaagtcca aatatgtttt ggtcaaagct gttgagtgga 180aagaacccta tgtggaaata ctgttcggat actcctcaaa ggagtcgaaa aagagtacgg 240catcttcagg gctatgagga gactaccaaa gttggcaatg gcggaaactt aaagaagaaa 300aagaaggctt cagatgatgt cacagtagat aacgctgaga gaaaagcctc tggaaaggat 360cacatgggta aaacagttca cttcctgctc ctttacctct agtgttcatt gaatgttcca 420tttactttgc ttactatctt tccttcaggg catttggagt caccaaa 467<210>29<211>490<212>DNA<213>甘藍(lán)<220><223>seq1-47<400>29gaattcagct tttaaaactg atctctgctc acagataatt taagagtcag tgaaaattga 60gataaaacga accaaaactg gaggtaacag atactctgag aacaactaac cttttcttca 120taagtcttct ttgtgttctc tgattctctc cgcagcttct ccagttcatg ctgaaatggg 180tcactgaaca cagggaaagg tacttgagga acaggtggag tggcattctg tcccgtagca 240ttgttaagct gtgaagaaac aggagctgtt acacctgctg gaggctccac aacaccttca 300tcgacaacgt ctgcgtaaaa ggtattacca gattgtcagt ttctctggca aacacatacg 360ttatacttaa atgcaaaaga gcagttactg acttgcaaag gttggttgtt ctacttgagc 420atcaggttct gctacttcca tttcacatgc ttctgatcca gttgtgcgag gcgcagccat 480tgttgtgttg 490<210>30<211>515<212>DNA<213>甘藍(lán)<220><223>2-33<400>30tctagagaag aggtggatta tgtatattct tttctgtact gcatgaagag tctattcgtg 60gggcgcacac aaggtttcca agaaaagggt gaagaatgca tggctgagaa aagaggtagc 120cattatagct cagtaaccaa ggatgttgaa aagactatta gcgacatcaa aaagaaatgc 180agtaagagcc tgcataagct tgtacaaacc ctcgaggaag aaaagatgga cctgatgaat 240aggaatgctg tcaagaagca ggaacttcag aattgtaaaa aggtggaagc atcatttatt 300cgtgtcacct attcaggtat aaatactcag agcttacatg atgctctcca acggctggaa 360tgtacttttg aaagaaagtt tgatgatctc aaaggagagt tggatgaatg ccttgaaagt 420ttagagcaaa taaacgaggc tggaaagaag aagttggctg aagatgaagc ctgttggatt 480agtcggatag agaaatgggc acgagctgaa ttaag515<210>31<211>574<212>DNA<213>甘藍(lán)<220><223>seq2-37<400>31tctagaccaa actattaaac gctaaacata agaagattag atcactcgtc atcagagaga 60cagaccacat cattgctcct ctgcaatcca ctccccaagt tctgtggttg ttcttgctgc 120tgaataaacg catttgaata tggtaaaggg ttggagatga gaggttgtct tggttgaggc 180attgtgcagt acggagccga 
agcagtatga ttcctcagtg cgcttacttg tgttgctctc 240tgtgctagct gctggattct aactggagaa agaaaaaaag aaaaaaaagg tgttattatg 300acttcataac cttatatctt taaaaaacaa ttatgcttct attattcgaa cacttgccca 360ttggagttgc tgctgaggaa tgagaggaga ttctgctcgt acatttagac aagaacgcac 420tcgacaacag cttgttcttt ataacaagat tcttcctcgt ctgtaacttc gtctttctgg 480ctgcatgtac agcttgtacc tcatgaaact ttctctgata ctcttcttgt aattcagcta 540tcttcttctc gaatttagct ttcaagactg cttt 574<210>32<211>466<212>DNA<213>甘藍(lán)<220><223>seq2-53<400>32tctagattgt aattttaaat ttacaacaaa ttttgaaagg gtcagcgatg agtttgcaaa 60tctccgtgtt tcctccagca ttgctcagcc agttcaagaa cctgatcact tggcacaggt 120tggtttcttc ttgctttact ttggacacct gtttaatatt ggcctgtcaa atttacttat 180ccttttactt ctaaactgca aattctggtc tgcattgcat tgtgatatga aggtatctgg 240acccgcttca agcagagact atggggagga caggcagaat atgcaacaag ataaatcaca 300tgaccgaaag ttgtcatcga tgtatccaga gtattgggtt ccagtgcagc tatcagatgt 360acagatagag caatactgtc ggactctctt ctccaaatct tcatctcttt cttcgctgtc 420gaggactgat cctgttcgag ctcttgaaca aactctcagt tctgta466<210>33<211>417<212>DNA<213>甘藍(lán)<220><223>seq2-57<400>33tctagagcaa ttgaaaccta attccgattt tgcgcgggcc agagattctt cacggttgaa 60cttttgctta acgaaagaga ctgcaatcca aatctggaag tgcattatta agaacgtatt 120cagcaatatt cataaattat gcaacaatca aaggccttac gttgtggcct acaaagcatg 180gattttgtta gatattagta gctagtctaa ttcaagcaat taatggaagt ttctatccta 240tgactggaaa gttaaacatt cccacaaaag cagtgatgcc acagatgatg aagaagaaaa 300atgcatatac tatggaagtg aatgctatca taccacagct atctggaagg cctgcaatgt 360tgtagctggc tctttgcaga cacggtggtt gtcaataata tattcaagaa ctttttc41權(quán)利要求
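1. A DNA comprising an open reading frame encoding a protein characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity to an aligned component sequence of SEQ ID NO: 3.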
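2. The DNA according to claim 1, comprising an open reading frame encoding a protein of the formula R1-R2-R3, wherein
-- R1, R2 and R3 constitute component sequences consisting of amino acid residues independently selected from the group of amino acid residues Gly, Ala, Val, Leu, Ile, Phe, Pro, Ser, Thr, Cys, Met, Trp, Tyr, Asn, Gln, Asp, Glu, Lys, Arg and His;
-- R1 and R3 independently consist of 0 to 3000 amino acid residues;
-- R2 consists of at least 150 amino acid residues; and
-- R2 is at least 40% identical to an aligned component sequence of SEQ ID NO: 3.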
3. The DNA according to claim 1, which encodes one or more SWI2/SNF2-like ATPase/helicase motifs.
4. The DNA according to claim 1, comprising an open reading frame encoding a protein having a component sequence defined by amino acids 478-490, 584-600, 617-630, 654-668, 676-690, 718-734, 776-788, 1222-1233, 1738-1749 or 1761-1770 of SEQ ID NO: 3.
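5. The DNA according to claim 1, wherein the protein encoded by the open reading frame is characterized by the amino acid sequence of SEQ ID NO: 3, or by an allelic amino acid sequence in which the M at position 705 of SEQ ID NO: 3 is replaced by the amino acid residue K or the E at position 1219 of SEQ ID NO: 3 is replaced by the amino acid residue D.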
6. The DNA according to claim 1, characterized by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 2.
7. The DNA according to claim 1, wherein expression of RNA complementary to the mRNA transcribed from said DNA releases the silencing of a transgenic marker gene.
8. A protein encoded by the open reading frame according to any one of claims 1 to 7.
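9. A method of producing the DNA according to claim 1, comprising
- screening a DNA library for clones capable of hybridizing to a fragment of the DNA defined by SEQ ID NO: 1 or SEQ ID NO: 2, wherein said fragment has a length of at least 15 nucleotides;
- sequencing hybridizing clones;
- purifying vector DNA of clones comprising an open reading frame encoding a protein characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity to SEQ ID NO: 3;
- optionally further processing the purified DNA.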
10. A polymerase chain reaction wherein at least one oligonucleotide used comprises a nucleotide sequence representing 15 or more base pairs of SEQ ID NO: 1 or SEQ ID NO: 2.
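As an illustration of the criterion in claim 10, the following Python sketch checks whether an oligonucleotide shares a contiguous run of 15 or more bases with a reference sequence on either strand. The reference here is only a 96-base excerpt of SEQ ID NO: 2 (positions 2320 to 2415 of the listing), the oligonucleotides are SEQ ID NO: 14 and SEQ ID NO: 4 from the listing, and all function names are hypothetical. Reading "representing" as an exact contiguous match on either strand is an assumption made for this sketch, not a definition taken from the claims.

```python
# Excerpt of SEQ ID NO: 2, positions 2320-2415 as printed in the listing.
SEQ2_EXCERPT = (
    "gagaagataaaaatcgagtcatgttctgaaagaactaagatattccga"
    "ttgtactcagtatgtacagttgaagaaaaagccctgattctggctagg"
)
SEQ_ID_NO_14 = "ctgtacatactgagtacaatcgga"  # 24-mer from the listing
SEQ_ID_NO_4 = "catctacggcaatgtaccagc"      # 21-mer from the listing


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string (a, c, g, t only)."""
    complement = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def longest_shared_run(oligo: str, reference: str) -> int:
    """Length of the longest contiguous stretch of `oligo` found in `reference`."""
    oligo, reference = oligo.lower(), reference.lower()
    best = 0
    for start in range(len(oligo)):
        # Only substrings longer than the current best are worth testing.
        for end in range(start + best + 1, len(oligo) + 1):
            if oligo[start:end] in reference:
                best = end - start
            else:
                break  # a longer substring from this start cannot match either
    return best


def represents_reference(oligo: str, reference: str, min_len: int = 15) -> bool:
    """True if `oligo` shares >= min_len contiguous bases with either strand."""
    forward = longest_shared_run(oligo, reference)
    reverse = longest_shared_run(oligo, reverse_complement(reference))
    return max(forward, reverse) >= min_len


print(represents_reference(SEQ_ID_NO_14, SEQ2_EXCERPT))
print(represents_reference(SEQ_ID_NO_4, SEQ2_EXCERPT))
```

In this excerpt SEQ ID NO: 14 matches the reverse strand over its full length, whereas SEQ ID NO: 4 shares no 15-base run with either strand of the excerpt. Degenerate oligonucleotides such as SEQ ID NO: 7 to 13 contain ambiguity codes and would need a different matching rule.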
Abstract
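The present invention relates to DNA encoding proteins involved in gene silencing. Related genes can be isolated from different sources such as mammalian or plant cells; the proteins they encode are characterized by an amino acid sequence comprising a component sequence of at least 150 amino acid residues having 40% or more identity to an aligned component sequence of SEQ ID NO: 3. Further disclosed is a method of isolating the DNA according to the invention.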
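The identity criterion stated above (a component sequence of at least 150 residues with 40% or more identity to an aligned stretch of SEQ ID NO: 3) can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" for gaps. The function name is hypothetical, and counting identity as matching columns divided by all alignment columns in the window is an assumption; the text here does not fix how identity is computed.

```python
from itertools import accumulate


def window_meets_identity(aligned_query: str, aligned_ref: str,
                          min_residues: int = 150,
                          min_identity_pct: float = 40.0) -> bool:
    """Check whether any alignment window holding at least `min_residues`
    query residues reaches `min_identity_pct` percent identical columns."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")

    # Drop columns where both sequences carry a gap.
    columns = [(q, r)
               for q, r in zip(aligned_query.upper(), aligned_ref.upper())
               if not (q == "-" and r == "-")]

    # Prefix sums: identical columns and query residues up to each column.
    matches = [0] + list(accumulate(int(q == r and q != "-") for q, r in columns))
    residues = [0] + list(accumulate(int(q != "-") for q, _ in columns))

    n = len(columns)
    for start in range(n):
        for end in range(start + min_residues, n + 1):
            if residues[end] - residues[start] < min_residues:
                continue  # window does not yet hold enough query residues
            identical = matches[end] - matches[start]
            # Compare as 100 * identical >= pct * length to avoid rounding issues.
            if 100 * identical >= min_identity_pct * (end - start):
                return True
    return False


# Synthetic toy sequences, not taken from the sequence listing.
print(window_meets_identity("A" * 150, "A" * 60 + "G" * 90))
```

The exhaustive window search is quadratic in alignment length, which is acceptable for a protein of about 2000 residues; a production comparison would more likely rely on an established alignment program and its own identity definition.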
Document number: C12N15/09GK1364194SQ00809325
Publication date: 14 August 2002. Filing date: 21 June 2000. Priority date: 23 June 1999.
Inventors: Y. Habu, O. Mittelsten Scheid, P. Amedeo, J. Paszkowski. Applicant: Syngenta Participations AG