Getting the email body from a parsed email object in Jython

I have a parsed email object:

    import email.Parser

    # Parse the raw message file into a Message object.
    fp = open(self.currentEmailPath, "rb")
    p = email.Parser.Parser()
    self._currentEmailParsedInstance = p.parse(fp)
    fp.close()

From this object, self.currentEmailParsedInstance, I want to get the body of the email as plain text only, with no HTML.

How do I do it?


Something like this?

        newmsg=self._currentEmailParsedInstance.get_payload()
        body=newmsg[0].get_content....?

And then strip the HTML from the body? What is the method (the "...." above) that returns the actual text? Maybe I misunderstand you.

        msg=self._currentEmailParsedInstance.get_payload()
        print type(msg)

Output: <type 'list'>


The email:

Return-Path:
Received: from xx.xx.net (example) by mxx3.xx.net (xxx)
id 485EF65F08EDX5E12 for xxx@xx.com; Thu, 23 Oct 2008 06:07:51 +0200
Received: from xxxxx2 (ccc) by example.net (ccc) (authenticated as xxxx.xxx@example.com) id 48798D4001146189 for example.example@example-example.com; Thu, 23 Oct 2008 06:07:51 +0200
From: "example"
To:
Subject: FW: example Date: Thu, 23 Oct 2008 12:07:45 +0800
Organization: example Message-ID: <001601c934c4$xxxx30$a9ff460a@xxx>
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_NextPart_000_0017_01C93507.F6F64E30"
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Thread-Index: Ack0wLaumqgZo1oXSBuIpUCEg/wfOAABAFEA

This is a multi-part message in MIME format.

------=_NextPart_000_0017_01C93507.F6F64E30
Content-Type: multipart/alternative;
boundary="----=_NextPart_001_0018_01C93507.F6F64E30"

------=_NextPart_001_0018_01C93507.F6F64E30
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit

From: example.example[mailto:example@example.com]
Sent: Thursday, October 23, 2008 11:37 AM
To: xxxx@example.com
Subject: S/I for example(B/L
No.:4357-0120-810.044)

Please find attached the example.doc),

Thanks.

B.rgds,

xxx xxx

------=_NextPart_001_0018_01C93507.F6F64E30
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:st1=3D"urn:schemas-microsoft-com:office:smarttags" =
xmlns=3D"http://www.w3.org/TR/REC-html40">

HTML STUFF till

------=_NextPart_001_0018_01C93507.F6F64E30--

------=_NextPart_000_0017_01C93507.F6F64E30
Content-Type: application/msword;
name="xxxx.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="xxxx.doc"

0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAABAAAAYAAAAAAAAAAA EAAAYgAAAAEAAAD+////AAAAAF8AAAD///////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// //////////////////////////////////////////////////////////////////////////// ///////////////////////////////////////////////////////////////////////////s pcEAI2AJBAAA+FK/AAAAAAAAEAAAAAAABgAAnEIAAA4AYmpiaqEVoRUAAAAAAAAAAAAAAAAAAAAA AAAECBYAMlAAAMN/AADDfwAAQQ4AAAAAAAAPAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA AAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAAAAAAAAAAAAKQAAAAAAEYEAAAAAAAARgQAAEYE AAAAAAAARgQAAAAAAABGBAAAAAAAAEYEAAAAAAAARgQAABQAAAAAAAAAAAAAAFoEAAAAAAAA4hsA AAAAAADiGwAAAAAAAOIbAAA4AAAAGhwAAHwAAACWHAAARAAAAFoEAAAAAAAABzcAAEgBAADmHAAA FgAAAPwcAAAAAAAA/BwAAAAAAAD8HAAAAAAAAPwcAAAAAAAA/BwAAAAAAAD8HAAAAAAAAPwcAAAA AAAAMjYAAAIAAAA0NgAAAAAAADQ2AAAAAAAANDYAAAAAAAA0NgAAAAAAADQ2AAAAAAAANDYAACQA AABPOAAAaAIAALc6AACOAAAAWDYAAGkAAAAAAAAAAAAAAAAAAAAAAAAARgQAAAAAAABHLAAAAAAA AAAAAAAAAAAAAAAAAAAAAAD8HAAAAAAAAPwcAAAAAAAARywAAAAAAABHLAAAAAAAAFg2AAAAAAAA

------=_NextPart_000_0017_01C93507.F6F64E30--


I just want to get:

From: xxxx.xxxx [mailto:xxxx@example.com]
Sent: Thursday, October 23, 2008 11:37 AM
To: xxxx@example.com
Subject: S/I for xxxxx (B/L
No.:4357-0120-810.044)

Pls find attached the xxxx.doc),

Thanks.

B.rgds,

xxx xxx


I'm not sure whether the mail is malformed. It seems that when the message also carries an HTML version, you have to do this:

        parts = self._currentEmailParsedInstance.get_payload()
        print parts[0].get_content_type()
        # prints: multipart/alternative
        textParts = parts[0].get_payload()
        print textParts[0].get_content_type()
        # prints: text/plain
        body = textParts[0].get_payload()
        print body
        # prints the plain text without a problem

Thank you so much, Vinko.

So it's kind of like dealing with XML: recursive in nature.
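
To make that recursive structure visible, here is a minimal sketch that walks the tree by hand and lists each part's content type with its nesting depth. The helper name `outline` and the `SAMPLE` message are made up for illustration; the sample only mimics the nesting of the email above (multipart/mixed containing multipart/alternative plus an attachment).

```python
import email

def outline(part, depth=0):
    """Return a list of (depth, content_type) pairs for a message tree."""
    rows = [(depth, part.get_content_type())]
    if part.is_multipart():
        # For a multipart container, get_payload() returns a list of sub-parts.
        for sub in part.get_payload():
            rows.extend(outline(sub, depth + 1))
    return rows

# Hypothetical sample with the same nesting as the email in the question.
SAMPLE = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/mixed; boundary="outer"',
    '',
    '--outer',
    'Content-Type: multipart/alternative; boundary="inner"',
    '',
    '--inner',
    'Content-Type: text/plain; charset="us-ascii"',
    '',
    'Plain body',
    '',
    '--inner',
    'Content-Type: text/html; charset="us-ascii"',
    '',
    '<p>Plain body</p>',
    '',
    '--inner--',
    '',
    '--outer',
    'Content-Type: application/msword; name="x.doc"',
    'Content-Transfer-Encoding: base64',
    '',
    'AAAA',
    '',
    '--outer--',
    '',
])

msg = email.message_from_string(SAMPLE)
for depth, ctype in outline(msg):
    print("  " * depth + ctype)
```

The text/plain part sits two levels down, which is why `parts[0].get_payload()[0]` was needed for this particular email.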


Asked by: Dominik649 | Posted: 24-09-2021






Answer 1

This will get you the contents of the message:

    self.currentEmailParsedInstance.get_payload()

As for the text-only part, you will have to strip the HTML on your own, for example using BeautifulSoup.
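
If pulling in BeautifulSoup is not an option, a rough fallback is the standard library's HTMLParser. This is a swapped-in technique, not what the answer above recommends, and the names `TextExtractor` and `strip_html` are invented for this sketch. It simply collects the text between tags, so it will also keep the contents of script and style blocks, and on Python 2 / Jython it drops character entities such as `&amp;`.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2 / Jython

class TextExtractor(HTMLParser):
    """Collects the text found between HTML tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_html(html):
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.chunks).split())

print(strip_html("<html><body><p>Please find <b>attached</b> the file.</p></body></html>"))
```

For real-world Outlook HTML like the part shown in the question, BeautifulSoup is likely to cope better with broken markup.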

See the documentation of the email package for more information about the Message class the Parser returns. If you mean getting the text part of messages that contain both an HTML and a plain-text version of themselves, you can pass an index to get_payload() to get the part you want.

I tried with a different MIME email because what you pasted seems malformed; hopefully it only got malformed when you edited it.

    >>> import email.parser
    >>> parser = email.parser.Parser()
    >>> message = parser.parse(open('/home/vinko/jlm.txt','r'))
    >>> message.is_multipart()
    True
    >>> parts = message.get_payload()
    >>> len(parts)
    2
    >>> parts[0].get_content_type()
    'text/plain'
    >>> parts[1].get_content_type()
    'message/rfc822'
    >>> parts[0].get_payload()
    'Message Text'

parts will contain all parts of the multipart message; you can check their content types as shown and keep only the text/plain ones, for instance.
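
Filtering by content type can also be done without knowing how deeply the parts are nested, because Message.walk() visits every part of the tree. Below is a minimal sketch under a few assumptions: the helper name `first_plain_text` and the `SAMPLE` message are hypothetical, text attachments are skipped by checking for a filename, and the declared charset is assumed to be one Python knows.

```python
import email

def first_plain_text(msg):
    """Return the first text/plain body in msg, or None if there is none."""
    for part in msg.walk():
        # walk() also yields the multipart containers; skip anything
        # that is not a text/plain leaf.
        if part.get_content_type() != "text/plain":
            continue
        # A text/plain part with a filename is treated as an attachment.
        if part.get_filename():
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "us-ascii"
        return payload.decode(charset, "replace")
    return None

# Hypothetical sample: plain-text and HTML alternatives of the same body.
SAMPLE = "\n".join([
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative; boundary="b"',
    '',
    '--b',
    'Content-Type: text/plain; charset="us-ascii"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    'Total =3D 42',
    '',
    '--b',
    'Content-Type: text/html; charset="us-ascii"',
    '',
    '<p>Total = 42</p>',
    '',
    '--b--',
    '',
])

body = first_plain_text(email.message_from_string(SAMPLE))
print(body)
```

Returning None instead of an error string lets the caller decide what to do when a message has no plain-text part, for example falling back to the HTML part.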

Good luck.

Answered by: Kristian414 | Posted: 25-10-2021



Answer 2

I ended up with this:

        parser = email.parser.Parser()
        self._email = parser.parse(open('/home/vinko/jlm.txt','r'))
        parts=self._email.get_payload()
        check=parts[0].get_content_type()
        if check == "text/plain":
            return parts[0].get_payload()
        elif check == "multipart/alternative":
            part=parts[0].get_payload()
            if part[0].get_content_type() == "text/plain":
                return part[0].get_payload()
            else:
                return "cannot obtain the body of the email"
        else:
            return "cannot obtain the body of the email"

Answered by: Tess318 | Posted: 25-10-2021


