Referee: The paper "Monte-Carlo methods for NLTE spectral synthesis of Supernovae" by Ergon et al. is generally well written, and deserves publication. The authors have carefully described a new non-LTE and time-dependent Monte-Carlo code, and have undertaken numerous tests to check the validity of the code. They also discuss extensively a new technique of using Markov chains to decrease the computational effort. However, I do have several concerns that need to be addressed before the paper is acceptable for publication.

Ergon et al: First, a general comment on the code comparisons: The main purpose of the comparisons in the paper is to show that the JEKYLL code works as intended, and in that sense the comparisons shown in the paper should be sufficient. However, the purpose is also to initiate a process of comparisons between the spectral codes used within the SN community, so we will continue to work on this topic and hope to present more results in the future. The next step may be taken already this summer at a workshop at the Weissman Institute, when a number of codes (including JEKYLL, ARTIS, SUMO and CMFGEN) will be compared using a benchmark Type Ia SN model. It is also worth pointing out that code comparisons are very tedious work, and after spending more than a year on those included in the paper, we strongly argue that any additional work on this topic should go into future papers and not the current one.

1. Referee: In the introduction the authors make a number of references to MC modeling associated with SNe. However, the techniques of Lucy have been used in other areas, and some reference to this literature is warranted. For example, there is extensive work by Alex Carciofi on using MC techniques to study Be and B[e] disks.

Ergon et al: Fair point. We have added references to the work by Carciofi, as well as to that of Long and Knigge, in the first paragraph of the introduction.

2.
Referee: I get a little confused regarding the MC approach and its connection to lambda iterations. Most MC modeling utilizes the Sobolev approximation, and this helps avoid issues of non-convergence in the lines. The time-dependent approach also helps to overcome problems with lambda iterations, since it takes time for regions of large (continuum) optical depth to communicate with other regions. In the static situation, ALL depths are effectively coupled. A lambda iteration propagates information one optical depth at a time (in one sense like the MC technique), whereas for an ALO it is more like one grid point at a time.

Ergon et al: Both points are fair, although the latter is a bit subtle, as it applies to each individual time step and not necessarily to the time-dependent calculation as a whole. We have added a brief discussion of these points to the second paragraph of Section 2.1.

3. Referee: In 3.1.1 the authors note that they sometimes adopt the assumption that I=S. This seems to have worked in the case they considered, but I believe there are situations where this may fail. In situations where S is dominated by a single species, the assumption I=S is equivalent to detailed balance, which is known to be a bad approximation in some cases.

Ergon et al: The assumption I=S used in JEKYLL is equivalent to a generalized on-the-spot approximation like the one by Avrett & Loeser (1988), and in the case where S is dominated by a single transition this is equivalent to the classical on-the-spot approximation, often referred to as detailed radiative balance. It is not entirely clear to us which cases the Referee refers to, where this is known to be a bad approximation. Clearly, any approximation has to be used with care, but using the generalized on-the-spot approximation for ground-state continua with high optical depths seems well motivated.
Doing this is similar to the core-saturation method, where the frequency regions with high optical depths (where the radiative transfer is mainly local) are treated separately and approximately. Note also that Avrett & Loeser (1988) find that their generalized on-the-spot approximation gives results similar to those of an exact solution for their H+He test model. We have added a reference to Avrett & Loeser (1988) in Sect. 3.1.1, where we point out that the assumption I=S is essentially a generalized on-the-spot approximation.

4. Referee: I would like to see a discussion of the computational resources needed to run complex models.

Ergon et al: Fair point. We have added a new sub-section at the end of Sect. 3, where we discuss this as well as the parallelization of the code, which is a closely related subject.

5. Referee: In Figure 17 they show the influence of using 2, 4 and 8 lambda iterations per time step. Of particular concern is the variation in the U band. From the figure it does not appear to have converged. To examine convergence, the crucial quantity to be examined is the ratio of successive corrections, not the absolute size. If the ratio is close to 1, many more iterations may be needed to achieve convergence. *The authors should examine their convergence criteria carefully.*

Ergon et al: We are not sure we understand what the Referee means here. Examining the U-band lightcurves in Figure 17, it is quite clear that the ratio of successive corrections is decreasing drastically (in terms of U-band flux).

6. Referee: Footnote 5 should be rewritten to more clearly explain what is meant. The agreement between CMFGEN and JEKYLL does not prove that the codes can accurately model SN ejecta, especially since similar atomic data was used in the test. However, it does provide a test of the radiation transport and non-LTE solution in two very different codes which use very different techniques. How well did the light curves agree?

Ergon et al: Fair point.
We have added a clarification to the footnote.

7. Referee: The authors note that JEKYLL uses the Sobolev approximation but CMFGEN does not. In general, I agree with the authors that the Sobolev approximation is unlikely to cause errors because of the large velocity gradients in the SN ejecta. However, there is one potential case where the approximations used in the codes could lead to inaccurate answers. That occurs when the intrinsic profiles of two lines overlap, and strong scattering is occurring within one or both lines. The Sobolev approximation will not treat this interaction, while in CMFGEN the number of such interactions may be overestimated by the use of large microturbulent velocities.

Ergon et al: Fair point. We have added a clarification at the end of the first paragraph of Sect. 4.3.

8. Referee: If I read the text correctly, it would be informative to add "Both codes use LTE estimates for the population and the ionization state of the gas." to Figure 3's caption.

Ergon et al: Fair point. We have added this to the caption of Fig. 3.

9. Referee: Did the authors do a comparison between the results of ARTIS and JEKYLL using the "best approaches" that can be used by either code? Such a comparison would be informative, and more relevant to discussing the level of agreement/disagreement between the codes.

Ergon et al: That is certainly an interesting comparison, but as discussed above, it is outside the scope of the paper. We will return to this issue in future work.

10. Referee: In the SUMO/JEKYLL comparison, do the authors have an explanation for why the temperature and ionization are in relatively "poor agreement" in the H and H/N zones, particularly at early times?

Ergon et al: Not really, and the cause of this discrepancy could be quite cumbersome to nail down.
It is worth noting, however, that unlike JEKYLL, SUMO uses a classical lambda iteration rather than an accelerated one, and this makes the comparisons less trustworthy at early times, when the optical depths are higher. This difference is discussed in the first paragraph of Sect. 4.2, and we have also added a reference to it in the next-to-last paragraph, where the discrepancies seen at early times are discussed. Again, as discussed above, we will continue to compare JEKYLL and SUMO, so we hope to return to this issue in future work. Also, in general, the agreement in all three comparisons is quite good, so understanding the causes of the few discrepancies seen is not really needed to show that the JEKYLL code works as intended.

11. Referee: In Figure 11 they show a comparison of the bolometric light curves computed under three different assumptions. It would be nice to see the photometric behavior in different passbands for the same set of assumptions.

Ergon et al: Fair point, but again a bit outside the scope of the paper. The main purpose is to describe and test the JEKYLL code, and the application in Sect. 5 is merely provided as an example. We will return to the effect of NLTE in more depth in forthcoming papers. In addition, the effects of NLTE on the different wavelength regions are evident from the spectral comparison in Fig. 15.

12. Referee: The authors show in Figure 12 the light curves for Model 12, which was the best-fit model found by J15 to SN 2011dh. Can the authors show the 2011dh data on the same plot, or is that the subject of another paper?

Ergon et al: Yes, that is the subject of Paper 2, so we do not want to show any comparisons with the 2011dh data here.

14. Referee: In general, the tick marks on the axes are faint, and a little short. In many cases they could also add extra unlabelled tick marks (e.g., Figures 12, 17).

Ergon et al: Fair point.
We have made the tick marks thicker and longer, and added minor unlabelled tick marks to most figures.

15. Referee: On page 8, when discussing the diffusion approximation, the use of "outer boundary" may be a little confusing: it is the outer boundary of the diffusion solver and the inner boundary of the MC solver. Perhaps use "upper connecting boundary" or just "connecting boundary".

Ergon et al: Fair point. We have replaced "inner/outer boundary" with "connecting boundary" in Sect. 3.5.

16. Referee: Please recheck all equations in the appendix for typos. I did not do a detailed check.

Ergon et al: Done. A few typos were found and corrected.

17. Referee: There appears to be only one reference in Section 3.3.4 (The Markov chain solution). I would think more references may be warranted. Markov chains have been used to do radiative transfer, but I am unsure as to whether the technique has been applied elsewhere in a manner similar to that described in this paper.

Ergon et al: Fair point. We have explained that Markov chains have been used for MC radiative transfer before, and as an example we have added a reference to the work on scattering in planetary atmospheres of Esposito & House (1978).
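
As a generic illustration of why a Markov-chain treatment can reduce Monte-Carlo effort, the sketch below compares two ways of finding where a chain of interactions ends up. It is a minimal sketch only and is not taken from JEKYLL or from Esposito & House (1978): the two-state toy chain, its transition probabilities, and the function names `absorption_probabilities` and `random_walk` are all assumptions made here. For an absorbing Markov chain with transient-to-transient probabilities Q and transient-to-absorbing probabilities R, the end-state probabilities B satisfy (I - Q) B = R, so they can be obtained from one small linear solve instead of being sampled walk by walk.

```python
import random


def absorption_probabilities(Q, R):
    """Solve (I - Q) B = R for B by Gauss-Jordan elimination.

    Q[i][j]: probability of going from transient state i to transient state j.
    R[i][k]: probability of going from transient state i to absorbing state k.
    Returns B, where B[i][k] is the probability of ending in absorbing
    state k when the chain starts in transient state i.
    """
    n = len(Q)
    # Augmented matrix [I - Q | R]
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + list(R[i])
         for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def random_walk(Q, R, start, rng):
    """Sample one chain until absorption; return the absorbing-state index."""
    state = start
    while True:
        u = rng.random()
        acc = 0.0
        for j, p in enumerate(Q[state]):
            acc += p
            if u < acc:
                state = j  # move to another transient state and keep walking
                break
        else:
            for k, p in enumerate(R[state]):
                acc += p
                if u < acc:
                    return k
            return len(R[state]) - 1  # guard against round-off in the row sum


# Hypothetical toy chain: two transient and two absorbing states.
Q = [[0.0, 0.5],
     [0.4, 0.0]]
R = [[0.5, 0.0],
     [0.0, 0.6]]

# Direct solution; analytically B[0][0] = 0.625 for this toy chain.
B = absorption_probabilities(Q, R)

# Sampled estimate of the same quantity, which carries statistical noise.
rng = random.Random(42)
n_walks = 20000
hits = sum(1 for _ in range(n_walks) if random_walk(Q, R, 0, rng) == 0)
estimate = hits / n_walks
```

In this toy case the linear solve gives the end-state probabilities without sampling noise, whereas the sampled estimate only approaches the same value as the number of walks grows. Whether and how this carries over to a full radiative-transfer problem depends on the details described in Section 3.3.4 of the paper.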