Having suggested several experimental pull requests, some of which are referenced below, I have the following thoughts for changing the API to letters to help make it ready for v1.0.0 and to make it extensible in future.
The proposal uses functional options. This approach will allow users to efficiently get the parts of an email they are interested in. Through user-supplied funcs, it will also allow users to deal more flexibly with any malformed emails they are required to handle. Functional options also leave room for future extensibility.
Architecture
The architecture of the module would be changed so that the configuration is separated from the main functions of the package. This provides a convenient attachment point for configuration elements, including custom funcs.
See my parser-options branch for a runnable test using the functional options approach.
The package might also be reorganised to be more accessible to other contributors and to provide sub-package tests where necessary, for example for expanded date and email address parsing.
The logic in package letters could largely be determined by the configuration settings in parser/parser.go, which will dispatch (depending on configuration) to headers, body, files and helpers. The results will be collected in email/email.go.
.
├── letters.go        <- package docs and entry point
│
├── parser
│   ├── parser.go     <- parser struct and dispatcher
│   ├── headers.go    <- headers parser & decoder
│   ├── address.go    <- address specific parsing
│   ├── date.go       <- date specific parsing
│   ├── body.go       <- body (text, etext, html) parser & decoder
│   └── files.go      <- inline and attachment file parser & decoder
│                     * per-file tests expected
│
├── email
│   └── email.go      <- result struct and methods
│                        place for future extensions eg writer
│                        email_test.go might be useful
│
└── test              * package level tests
    └── testdata      * test data
Adjustments to the main Email struct
Only a few adjustments are suggested to the main Email struct.
Consideration should be given to promoting the Received header as either a []string or even a slice of a new Received type to a defined Headers field. Received information is invaluable for tracking email origins, but also is a useful fallback if date information cannot be extracted.
The main adjustment is to define a shared type for inline and attached files and to use io.Readers instead of []byte to represent data, along the lines of the fileData experiment. Using io.Readers for the main data types presents some challenges but can also considerably improve processing speed and reduce memory usage.
Customisation and Options
There are probably three areas where customisation might be useful. Customisation would be controlled through options on the configuration/parser struct which then are used to determine processing.
The customisation areas are:
1. Scope
While the parser may be expected to return the whole email by default, users may only wish to work with the headers, or with the whole email except for the file (inline and attached) components. Scope should be set by an enum to ensure only one scope is selected even if multiple scope options are provided.
The option arguments might be:
WithHeadersOnly()
WithNoAttachments()
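A rough sketch of how an enum-backed scope could work with these options follows. Only the option names WithHeadersOnly, WithNoAttachments and the constructor name NewEmailParser come from this proposal; the scope type, its constants and the Parser fields are assumptions. Because each option overwrites a single field, the last scope option given wins.

```go
package main

import "fmt"

// scope is a hypothetical enum describing how much of the email to parse.
type scope int

const (
	scopeAll           scope = iota // default: parse the whole email
	scopeHeadersOnly                // stop after the headers
	scopeNoAttachments              // skip inline and attached files
)

// Parser holds the configuration; only the scope field is sketched here.
type Parser struct {
	scope scope
}

// Option is a functional option that adjusts the Parser configuration.
type Option func(*Parser)

// WithHeadersOnly restricts parsing to the headers.
func WithHeadersOnly() Option {
	return func(p *Parser) { p.scope = scopeHeadersOnly }
}

// WithNoAttachments parses everything except inline and attached files.
func WithNoAttachments() Option {
	return func(p *Parser) { p.scope = scopeNoAttachments }
}

// NewEmailParser applies the options in order over the default scope.
func NewEmailParser(opts ...Option) *Parser {
	p := &Parser{scope: scopeAll}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func main() {
	// Two scope options are supplied; a single field means only one applies.
	p := NewEmailParser(WithHeadersOnly(), WithNoAttachments())
	fmt.Println(p.scope == scopeNoAttachments)
}
```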
2. Parsers and Custom Funcs
Users may wish to have more relaxed or custom date parsers, or extend the address parsing capabilities. In both cases these are likely to be needed due to poorly written mail user agents or SMTP servers.
Where custom funcs may be provided by the user, these are expected to be closures. For example, a custom date func may want to generate date layouts within the closure ahead of date parsing, as is done in message.go in net/mail.
Extended date parsing may be optionally provided (if it is not used as standard) as suggested in issue #115.
The possible need for a relaxed address parser was discussed in issue #67.
The option arguments might be:
WithExtendedDateParsing()
WithCustomDateFunc(func (string) (time.Time, error))
WithCustomAddressFunc(func (string) (*mail.Address, error))
These options would affect the config.dateFunc or config.addressFunc struct values as appropriate.
A special case is working with inline and attached files. As noted above, using io.Readers for the data of files can provide more efficient processing and can provide options for users to come up with parallel processing or novel file saving approaches -- for example streaming to a database. A func as follows might, for example, be used to save files to disk:
// Passing a closure to a writer func allows each file to be written to a
// directory on disk or perhaps written to a database. Here is a sketch
// (it needs the fmt, io, os and path/filepath imports):
myClosure := func(directory string) (func(r io.Reader) error, error) {
	fileNum := 0
	fileNameTpl := "file_from_email_%02d"
	// Create the target directory once, before any file is processed.
	err := os.Mkdir(directory, 0755)
	if err != nil {
		return nil, err
	}
	return func(r io.Reader) error {
		fileNum++
		path := filepath.Join(directory, fmt.Sprintf(fileNameTpl, fileNum))
		f, err := os.Create(path)
		if err != nil {
			return err
		}
		defer f.Close()
		// Stream the file contents to disk without buffering them in memory.
		_, err = io.Copy(f, r)
		return err
	}, nil
}
A WithFileProcessingFunc option would set the config.fileProcessingFunc struct value. (The fileData experiment referred to above writes the io.Readers to bytes for convenient testing.)
It may be useful to pass the func given to WithFileProcessingFunc more information about the email the file came from, such as the attachment filename and other details.
It would be simple to extend this functionality to allow users to process inline and attached files differently.
3. Decoding
It might be helpful to allow users to force the encoding type used to decode information where this is wrongly specified in the email.
In Brief
In brief, I suggest changing the public API so that a parser is created with NewEmailParser and emails are then parsed with its Parse verb.
Separating the parser instantiation from the Parse verb will allow the same configuration to be used for parsing more than one email. Different options may be given to NewEmailParser, such as WithHeadersOnly() or WithExtendedDateParsing().
I look forward to any comments.