Forking InputStream: Am I missing something?

Dennis

Dear all,

This has been puzzling me all morning: there is a reasonably elegant way (see example below) to log InputStreams and OutputStreams without consuming the streams. However, I cannot find any reference to it or a good implementation of it anywhere. I tested it and it seems to work, but I have the nagging idea that I'm missing something, or that I missed all the references to it on the internet.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ForkedInputStream extends InputStream {

    private InputStream m_source;
    private OutputStream m_fork;

    public ForkedInputStream(InputStream source, OutputStream fork) {
        m_source = source;
        m_fork = fork;
    }

    public int read() throws IOException {
        int b = m_source.read();
        if (b == -1) {
            m_fork.flush();
        } else {
            m_fork.write(b);
        }

        return b;
    }

    public int available() throws IOException {
        return m_source.available();
    }

    public void close() throws IOException {
        m_fork.close();
    }
}

You can feed in your original InputStream to ForkedInputStream and use the instance of ForkedInputStream as you would the original stream. The OutputStream will now contain the same information as read from the InputStream.
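A minimal usage sketch of the description above, assuming the data is a short UTF-8 string and the fork is an in-memory ByteArrayOutputStream. The class is repeated in compact form only so the example compiles on its own; ForkDemo and consume are illustrative names that do not appear in the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Compact copy of the class from the post, so the sketch is self-contained.
class ForkedInputStream extends InputStream {
    private final InputStream m_source;
    private final OutputStream m_fork;

    ForkedInputStream(InputStream source, OutputStream fork) {
        m_source = source;
        m_fork = fork;
    }

    @Override
    public int read() throws IOException {
        int b = m_source.read();
        if (b == -1) {
            m_fork.flush();   // end of stream: push out anything the fork buffered
        } else {
            m_fork.write(b);  // echo the byte that was just read
        }
        return b;
    }
}

// Hypothetical demo class, not part of the original post.
class ForkDemo {
    // Drains the stream byte by byte and returns what the consumer saw.
    static String consume(InputStream in) throws IOException {
        ByteArrayOutputStream seen = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            seen.write(b);
        }
        return new String(seen.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        InputStream source =
                new ByteArrayInputStream("<request/>".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream log = new ByteArrayOutputStream();

        String consumed = consume(new ForkedInputStream(source, log));
        String logged = new String(log.toByteArray(), StandardCharsets.UTF_8);

        // The consumer and the fork should have received the same bytes.
        System.out.println("consumer: " + consumed);
        System.out.println("fork:     " + logged);
    }
}
```

The consumer only ever sees an InputStream, so existing parsing code does not need to know that a copy is being taken.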

Of course, you could enhance it with threading, buffering, and a check to only write at certain (log) levels.
This would take care of the only two issues I can think of, but the principle stays the same:
- A blocked output would block the read() operation.
- Writing to the OutputStream takes time, so you might not want it 'on' in every situation.
Or, as the title suggests: am I missing something?
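The buffering and log-level ideas could be sketched roughly as below. This assumes java.util.logging as the logging framework, which the post does not name; LoggingForks and maybeFork are hypothetical names. When the level is disabled the original stream is returned untouched, so the fork costs nothing; otherwise writes to the fork go through a BufferedOutputStream. The sketch does not address the blocked-output issue, which would need a separate thread or queue.

```java
import java.io.BufferedOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of the original post.
final class LoggingForks {
    private LoggingForks() {}

    // Returns the source unchanged when the level is off, otherwise a forking wrapper.
    static InputStream maybeFork(InputStream source, OutputStream fork,
                                 Logger logger, Level level) {
        if (!logger.isLoggable(level)) {
            return source; // logging disabled: no copying, no overhead
        }
        final OutputStream buffered = new BufferedOutputStream(fork);
        return new FilterInputStream(source) {
            @Override
            public int read() throws IOException {
                int b = in.read();
                if (b == -1) {
                    buffered.flush();
                } else {
                    buffered.write(b);
                }
                return b;
            }

            @Override
            public int read(byte[] buf, int off, int len) throws IOException {
                int n = in.read(buf, off, len);
                if (n == -1) {
                    buffered.flush();
                } else {
                    buffered.write(buf, off, n);
                }
                return n;
            }

            @Override
            public void close() throws IOException {
                try {
                    buffered.flush(); // the caller keeps ownership of the fork itself
                } finally {
                    in.close();
                }
            }
        };
    }
}
```

Whether the wrapper should also close the fork is a design choice; here the fork is only flushed, on the assumption that a log destination usually outlives a single request.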

Kind regards,
Dennis
Brains2B.org
 
markspace

Dennis said:
public void close() throws IOException {
m_fork.close();

You probably should close the input stream here too.


Aside from that the only comment I have is that this will be very
inefficient, since you don't override read( byte[], int, int ) and
therefore can't read more than one byte (via the method you do override,
read()) at a time.

Next step: override read(byte[]) and read(byte[],int,int).
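The next step described above might look like the following sketch. BulkForkedInputStream is a hypothetical name, and this is one assumed way to extend the posted class, not the original poster's code. It overrides the three-argument read and closes both streams.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical variant of the posted class with a bulk read and a fuller close().
class BulkForkedInputStream extends InputStream {
    private final InputStream m_source;
    private final OutputStream m_fork;

    BulkForkedInputStream(InputStream source, OutputStream fork) {
        m_source = source;
        m_fork = fork;
    }

    @Override
    public int read() throws IOException {
        int b = m_source.read();
        if (b == -1) {
            m_fork.flush();
        } else {
            m_fork.write(b);
        }
        return b;
    }

    // InputStream.read(byte[]) delegates to this method, so one override
    // covers both bulk variants.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = m_source.read(buf, off, len);
        if (n == -1) {
            m_fork.flush();
        } else if (n > 0) {
            m_fork.write(buf, off, n); // copy only the bytes actually read
        }
        return n;
    }

    @Override
    public int available() throws IOException {
        return m_source.available();
    }

    @Override
    public void close() throws IOException {
        try {
            m_source.close();
        } finally {
            m_fork.close(); // close the fork even if closing the source fails
        }
    }
}
```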
 
dennis

Hi markspace,
Aside from that the only comment I have is that this will be very
inefficient, since you don't override read( byte[], int, int ) and
therefore can't read more than one byte (via the method you do override,
read()) at a time.
Next step: override read(byte[]) and read(byte[],int,int).

Thanks for your comments. I left the other read implementations out of the example for clarity, but you are right, they should be there. The close was just me being too hasty.

Kind regards,
Dennis
 
dennis

== java.io.SequenceInputStream, with more bugs

Hi,

Thanks for the pointer to SequenceInputStream. However, SequenceInputStream only takes InputStreams, and there is no SequenceOutputStream. Which gets me back to the original question: what am I missing, given that 'forking' to an OutputStream does not seem to be done?

What bugs did you find?

Kind regards,
Dennis
 
Ian Shef

== java.io.SequenceInputStream, with more bugs

I don't think so.
You might be correct about the bugs, but not about the equivalence.

From the Javadocs:
"A SequenceInputStream represents the logical concatenation of other input
streams."

A SequenceInputStream performs concatenation. A ForkedInputStream provides
what I would call a Tee. It provides input, and also echoes that input to
the designated OutputStream.

--------

The original poster might want to look at
http://commons.apache.org/io/apidocs/org/apache/commons/io/input/TeeInputStream.html

for another implementation of this idea. It has more features and possibly
fewer bugs.

The sources are open, for some definition of "open".

See
http://svn.apache.org/viewvc/commons/proper/io/trunk/src/java/org/apache/commons/io/input/TeeInputStream.java?view=markup
 
Mike Schilling

Dennis said:
Dear all,

This has been puzzling me all morning: There is a reasonably elegant
way (see example below) to log InputStreams and OutputStreams,
without consuming the streams. However, I cannot find any reference
to it or a good implementation of it anywhere. I tested it and it
seems to work, but I have the nagging idea that I'm missing
something, or I did miss all the references to it on the internet.

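I'd probably extend FilterInputStream and just call super for all the
processing of the "main" stream, e.g.

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class TeeInputStream extends FilterInputStream
{
    private OutputStream m_tee;

    public TeeInputStream(InputStream in, OutputStream fork)
    {
        super(in);
        m_tee = fork;
    }

    public int read() throws IOException
    {
        int c = super.read();
        if (c == -1)
            m_tee.flush();
        else
            m_tee.write(c);
        return c;
    }

    public void close() throws IOException
    {
        super.close();
        m_tee.close();
    }

    // etc. (read(byte[], int, int) needs the same treatment, because
    // FilterInputStream forwards it straight to the wrapped stream)
}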
 
EJP

== java.io.SequenceInputStream, with more bugs

Sorry, I am crazy, it is actually more like tee(1). I wrote a
TeeInputStream at some point about 10 years ago, wonder where it is.
 
dennis

The original poster might want to look at
http://commons.apache.org/io/apidocs/org/apache/commons/io/input/TeeInputStream.html
for another implementation of this idea. It has more features and possibly
fewer bugs.

Hi Ian,

Thanks for the reference. This was helpful. At least now I know it has been done before (and with no unexpected side effects).
I'm not sure yet if I want to use that one directly or use my own implementation, for two reasons:
- First, they are using ProxyInputStream, which possibly allows skip(), mark() and reset() on the underlying stream. That could mean part of the input is written to the output multiple times or not at all. You either have to handle that or at least mark in the output that a part was not written or was written multiple times, especially when you use it for logging or storing XML requests.
- Second, the license issue. If I use mine, it will be released under the modified BSD license, so there are no restrictions on future users.

Kind regards,
Dennis
 
dennis

Hi Mike,

Thanks for the feedback. I looked into FilterInputStream, but did not want to use it because the underlying stream might support skip(), mark() and reset(), and these could prevent data from being written to the output or cause data to be written multiple times. You either have to handle that or at least mark in the output that a part was not written or was written more than once, especially when you use it for logging or storing XML requests.

Kind regards,
Dennis..
 
Mike Schilling

dennis said:
Hi Mike,

Thanks for the feedback. I looked into FilterInputStream, but did not
want to use it because the underlying stream might support skip(),
mark() and reset(), and these could prevent data from being written to
the output or cause data to be written multiple times. You either have to
handle that or at least mark in the output that a part was not
written or was written more than once, especially when you use it for
logging or storing XML requests.


I must be missing something. Whether you extend FilterInputStream or
not, you still have the ability to override mark(), markSupported(), skip(),
and reset() yourself and do whatever you think is best.
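One way to read this suggestion, as a sketch only: extend FilterInputStream and override the repositioning methods so that every byte reaches the tee exactly once. NoRewindTeeInputStream is a hypothetical name, and refusing mark/reset while skipping by reading is just one possible policy among several.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical tee stream that never rewinds and never silently drops bytes.
class NoRewindTeeInputStream extends FilterInputStream {
    private final OutputStream m_tee;

    NoRewindTeeInputStream(InputStream in, OutputStream tee) {
        super(in);
        m_tee = tee;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b == -1) {
            m_tee.flush();
        } else {
            m_tee.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n == -1) {
            m_tee.flush();
        } else if (n > 0) {
            m_tee.write(buf, off, n);
        }
        return n;
    }

    // Refuse mark/reset so no byte can be read, and therefore written, twice.
    @Override
    public boolean markSupported() {
        return false;
    }

    @Override
    public void mark(int readlimit) {
        // intentionally ignored
    }

    @Override
    public void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }

    // Skip by reading, so skipped bytes still reach the tee.
    @Override
    public long skip(long n) throws IOException {
        byte[] scratch = new byte[2048];
        long skipped = 0;
        while (skipped < n) {
            int r = read(scratch, 0, (int) Math.min(scratch.length, n - skipped));
            if (r == -1) {
                break;
            }
            skipped += r;
        }
        return skipped;
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();
        } finally {
            m_tee.close();
        }
    }
}
```

An alternative policy would be to allow mark/reset and write a marker into the tee whenever a reset happens, which keeps rewinding available at the cost of a less clean log.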
 
Lew

dennis said:
- The license issue. If I use mine it will be done under the modified
BSD-license so no restrictions to future users.

Of what restrictions do you speak? The point of open-source licenses is that
they *remove* restrictions from future users.

The BSD license and the Apache license aren't very far apart in their
consequences.

I really am interested in what negative consequences you anticipate from the
Apache license. What are they?
 
John B. Matthews

Lew said:
- The license issue. If I use mine it will be done under the modified
BSD-license so no restrictions to future users.

Of what restrictions do you speak? The point of open-source licenses
is that they *remove* restrictions from future users.

The BSD license and the Apache license aren't very far apart in their
consequences.

I really am interested in what negative consequences you anticipate from the
Apache license. What are they?

I would like to hear more on this, too. In particular, which modified
BSD-license?

<http://en.wikipedia.org/wiki/BSD_licenses>
 
dennis

Hi Lew,
Of what restrictions do you speak? The point of open-source licenses is that
they *remove* restrictions from future users.
The BSD license and the Apache license aren't very far apart in their
consequences.
I really am interested in what negative consequences you anticipate from the
Apache license. What are they?

First, let me clarify something. By users, in this case, I mean people or customers using the sources and not only the compiled code. For the latter, no real restrictions apply. For people using and deriving from the sources, restrictions do apply.
The most notable is from the GPL: all derived works should also be open source. http://www.opensource.org/licenses/gpl-2.0.php
The Apache license has just one restriction that has some of my customers worried: if you have a NOTICE in your software and made a derivative work from any Apache code, you have to mention Apache in that notice. http://www.apache.org/licenses/LICENSE-2.0.html
A comparable advertising restriction was in the original BSD license as well, but it was deleted in 1999. Hence the name 'modified' BSD license. Although me still referring to it as 'modified' 11 years after the change might indicate I'm getting old ;-) http://www.opensource.org/licenses/bsd-license.php

I use some open source software in products I create for my customers. After I leave, they should be able to extend or alter the work I have done, or the open source software it relies on, without being required to release it as open source or advertise other organisations, not even my own.

Don't get me wrong. I have nothing against Apache; they have some great stuff out there and I use it regularly. If I need to fix something in their code, I will supply it back to them for possible inclusion in future releases. For my customers it is a different matter. They cannot derive works from it without complying with the notice part of the license, or supplying it back to Apache. Something they might not be willing to do.

I hope this answers your questions.

Kind regards,
Dennis Groenendijk
 
dennis

Hi Mike,
I must be missing something. Whether you extend FilterInputStream or
not, you still have the ability to override mark(), markSupported(), skip(),
and reset() yourself and do whatever you think is best.

Of course I can override FilterInputStream's methods to do whatever I want. However, InputStream itself already had all the functionality I needed in these methods to make sure all bytes are read and written just once. It was thus more efficient to just use InputStream.

Kind regards,
Dennis
 
Lew

First let me clarify something. With users in this case I mean people or customers using the sources
and not only the compiled code. For the latter no real restrictions do apply. For the people using and
deriving of the sources restrictions do apply.
The most notable is from the GPL: All derived works should also be open source. http://www.opensource.org/licenses/gpl-2.0.php

I asked about the Apache license.
The Apache license has just one restriction that has some of my customers worried:

Your customers are Nervous Nellies.

Um, IMHO.
If you have a NOTICE in your software and made a derivative work from any Apache code you have
to mention Apache in that notice. http://www.apache.org/licenses/LICENSE-2.0.html
A comparable advertising restriction was in the original license from BSD as well,
but was deleted in 1999. Hence the name

They're not comparable at all. The BSD license referred to
advertising, as you mention; the Apache license does not. There is no
requirement in the Apache license to mention Apache in your
advertising. All you have to do is put copyright attributions (*not*
advertisements) in the NOTICE file and source code of your
application. They don't have to appear in the application interface,
splash screen or anywhere else but in the copyright notices. The BSD
license is similar in this respect - "Redistributions of source code
must retain the above copyright notice, this list of conditions and
the following disclaimer." Tomayto, tomahto.

<http://www.apache.org/licenses/>
I use some open source software in products I create for my customers. After I leave
they should be able to extend or alter the work I have done or the open source software it relies on,
without being required to release it as open source or advertise other organisations, not even my own.

Again, the Apache license does not say anything whatsoever about
advertising; it does not impose any restrictions or requirements that
the copyright holder be mentioned in advertising.
Don't get me wrong. I have nothing against Apache, they have some great stuff out there and I use it regularly.
If I need to fix something in their code I will supply it back to them for possible commitment in future releases.
For my customers it is a different matter. They cannot derive works of it, without complying to the notice
part of the license, or supplying it back to Apache. Something they might not be willing to do.

There is absolutely no requirement in the Apache license to submit
work back to the copyright holder, much less Apache. Wherever did you
get that notion?

It's your choice, but personally I don't see the problem with
requiring that copyright notices include the copyright information
from all copyright holders. It's hardly a marketing question, and
barely a restriction. After all, you're requiring that with the BSD
license, too. With respect to copyright attribution, there's hardly
any difference between the Apache license and the BSD or MIT licenses.

Why are you even open-sourcing your software given your concerns?
 
dennis

Hi Lew,

I found your response a bit strange. This were all answers to questions you yourself asked ?
Your customers are Nervous Nellies.
Um, IMHO.
No, actually they are not. Some are actually very good at taking risks: the ones that make them a profit. Some of them might have legal departments that pore over any license agreement. I even have customers who contractually bind you to the fact that they will not accept any liability arising from licenses.
They're not comparable at all. The BSD license referred to
advertising, as you mention; the Apache license does not. There is no
requirement in the Apache license to mention Apache in your
advertising.

Sorry, I created a bit of confusion here. I used advertise in the sense of make public. Both Apache and the old BSD license require you to make public that you use a derivative of their work. Granted, the old BSD license went much further.
They don't have to appear in the application interface,
splash screen or anywhere else but in the copyright notices.

This is what 4.4 of the Apache license says: "within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear." How would you interpret this, knowing most of my customers will not release the source code?
The BSD license is similar in this respect - "Redistributions of source code
must retain the above copyright notice, this list of conditions and
the following disclaimer." Tomayto, tomahto.
It is not that similar, because the modified BSD only refers to source code. The Apache license extends this towards generated code as well.
There is absolutely no requirement in the Apache license to submit
work back to the copyright holder, much less Apache. Wherever did you
get that notion?
I don't have the notion that this is required. I just described what I do when I alter open source code of any license. I try to give it back, so it might be included in future releases for everyone to enjoy.
Why are you even open-sourcing your software given your concerns?
I don't have any concerns. I answered your question on why I use one license instead of another. I use open source because it helps to build more efficient applications, leaving more time to focus on what is specific to that project. It also helps to improve the more general code those projects rely on by having it used more widely, scrutinized, adapted and improved by everyone who uses it and will share.
From www.brains2b.org: "By supplying this software as open source I hope it will prevent some 'reinventing of the wheel' that still seems very customary in software development. You will find different components you can use in your own projects that will hopefully save time and effort and can be adapted for your own needs. Next to this there are some applications I have written and are just handy, or fun and would otherwise be collecting dust."

Regards,
Dennis..
 
Lew

dennis said:
I found your response a bit strange. This were all answers to questions you yourself asked ?

I cannot parse that question, if question it is. What are you asking?

My response was "strange" only insofar as it corrected misinformation
from your post.
This is what 4.4 of the Apache license says: "within the Source form or documentation,
if provided along with the Derivative Works; or, within a display generated by the
Derivative Works, if and wherever such third-party notices normally appear." How would
you interpret this, knowing most of my customers will not release the source code?

I "interpret this" as saying exactly what it says, "... if and
wherever such third-party notices normally appear." If your splash
screen doesn't display copyright information, the Apache license doesn't
require you to put all the upstream copyright notices there.

This is not different from what I said, "They don't have to appear ...
anywhere else but in the copyright notices." I don't know why anyone
objects to displaying copyright notices from the copyright holder(s);
that's what copyright is for. There's a word for using copyrighted
material without acknowledging the copyright: plagiarism.
It is not that similar, because the modified BSD only refers to source code.
The Apache license extends this towards generated code as well.

No, it extends it to generated copyright notices. Nowhere does it
discuss placing copyright notices in generated code. You are
spreading misinformation about the Apache license.
I don't have the notion that this is required.

That contradicts what you said. You said,
They [your customers] cannot derive works of it, without complying to the notice
part of the license, or supplying it back to Apache.
Something they might not be willing to do.

Naturally I wondered why you thought the Apache license required that
"they ... supply... it back to Apache."
I don't have any concerns. I answered your question on why I use one license instead of another.

So despite what you said, you are not concerned that your customers
have to display the copyright notices of copyright holders?

It's hard to discuss a subject with someone who shifts their ground on
every point when a counterpoint or question is raised.

I see no purpose in discussing this with you further.
 
